Back in the days of Windows 1.0,
in the 1980s, there was a tool called Word Frequency that came with the MS
Word distribution package. As someone who uses English as a second language, I
used it heavily, because it helped me improve my vocabulary and correct
misspellings that were beyond the capacity of the available spelling checkers.
That MS Word add-on created a list of all the words in a document, ordered
by frequency. It made it easy to detect overuse or abuse of a certain word
or expression. The rarely used words were also helpful, because sometimes I
wrote Thomson instead of Thompson, car instead of cart, or made similar errors
that a spelling checker does not detect.
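
A list of this kind is easy to reproduce today. The following is only a minimal sketch in Python, not the original MS Word tool: the function name word_frequencies, the sample sentence and the simple rule for splitting text into words (lowercase English letters and apostrophes) are all assumptions made for illustration.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return (word, count) pairs, most frequent first."""
    # Assumption: a "word" is a run of English letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common()


# Hypothetical sample text containing the kind of slip described above.
sample = "The cart hit the car. The driver of the cart was Thompson, not Thomson."
table = word_frequencies(sample)

# Overused words sit at the top of the list.
for word, count in table[:3]:
    print(word, count)

# Words used only once are where a Thomson/Thompson slip tends to hide.
rare = [word for word, count in table if count == 1]
print(rare)
```

Reading the top of the list shows overuse, while reading the words that occur only once brings near-duplicates such as thomson and thompson side by side, where the eye can catch them.
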
Frequency analysis can also be
used as a means to establish the signature of a certain author, the
cultural level of the writer, his or her use of slang or technical jargon, and other
writing features. It is possible to extrapolate from the number of distinct words
used in a certain text to the total vocabulary of a person. Frequency analysis can
accuse some writers of having the vocabulary of a 10-year-old, or the word richness
of a Chinese-born second-year English student.
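
One simple way to put a number on word richness is the ratio of distinct words to total words, often called the type-token ratio. The sketch below is an illustration under that assumption only. It does not extrapolate to a person's total vocabulary, and the function name and sample sentences are invented for the example.

```python
import re


def type_token_ratio(text):
    """Distinct words divided by total words (0.0 for empty text)."""
    # Assumption: a "word" is a run of English letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return len(set(words)) / len(words)


# Hypothetical samples: one repetitive, one varied.
plain = "the dog saw the dog and the dog ran"
varied = "a quick brown fox jumps over the lazy sleeping dog"

print(type_token_ratio(plain))   # 5 distinct words out of 9
print(type_token_ratio(varied))  # 10 distinct words out of 10
```

The ratio falls as a text gets longer, because common words keep repeating, so it is only meaningful when comparing texts of similar length.
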
Frequency analysis combined
with a synonym dictionary, as provided in currently available
synonymizer software, can help writers enrich their lexicon and
avoid overusing certain expressions.
It is also a means of avoiding
identical text for those who need to make their text different from a
source: for instance, web content writers who need to fill many similar but
not identical pages, and students who want to avoid plagiarism detection and
accusation, rightly or wrongly.
Plagiarism detection also makes use of
frequency analysis, because comparing a given text with the entire contents
of the Web is a major task, and the detection system does not know where to look
or where to start. Analysing the word frequency can therefore give some clues about
the writing style and the authorship of a given text, without indexing the
whole thing.
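
How real plagiarism detectors work is not described here, but the idea of comparing word frequency profiles can be sketched with cosine similarity, a standard measure of how closely two count vectors point in the same direction. The function names, the sample sentences and the choice of cosine similarity itself are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def profile(text):
    """Word-count profile of a text."""
    # Assumption: a "word" is a run of English letters or apostrophes.
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count profiles (0.0 to 1.0)."""
    # A Counter returns 0 for a missing word, so unshared words add nothing.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical texts: a source, a lightly edited copy, and an unrelated text.
source = "search engines use word frequency to find the subject of a page"
copy = "search engines use word frequency to detect the subject of a page"
other = "my cat sleeps on the sofa all day long"

print(cosine_similarity(profile(source), profile(copy)))
print(cosine_similarity(profile(source), profile(other)))
```

A score close to 1 means the two texts use much the same words in much the same proportions. A high score is only a hint about where to look closer, not proof of copying.
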
Search engines use word frequency to establish the
subject of web pages. They have developed complex linguistic analysis in order to
classify pages by subject without human intervention. In turn, webmasters do
the same, to try to fool search engines into assigning high keyword relevance
to the pages they create. For instance, using a word with a 3% frequency gives
a text good relevance for that word (or keyword, in a search engine context). A
10% frequency is still OK, but it is close to keyword stuffing, a
technique used by webmasters who try to force their websites into the top
places of the search engines. Keyword stuffing is penalized by the search
engines, and needs to be prevented by smart use of synonyms, either with
synonymizer software or with good writing skills.
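
The percentages above are simply the share of all words in a text that a given keyword represents. The sketch below computes that share and labels it using the rough 3% and 10% figures mentioned in this article. Those figures are rules of thumb from the text, not limits published by any search engine, and the function names and labels are invented for the example.

```python
import re


def keyword_density(text, keyword):
    """Percentage of the words in text that equal keyword (case-insensitive)."""
    # Assumption: a "word" is a run of English letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return 100.0 * words.count(keyword.lower()) / len(words)


def rate_density(percent, good=3.0, risky=10.0):
    """Label a density with the article's rough 3% and 10% figures."""
    if percent >= risky:
        return "close to keyword stuffing"
    if percent >= good:
        return "good relevance"
    return "below 3%"


# Hypothetical text: the keyword appears 3 times in 100 words.
text = "frequency " * 3 + "filler " * 97
density = keyword_density(text, "frequency")
print(density, rate_density(density))
```

When a keyword drifts toward the upper figure, replacing some of its occurrences with synonyms brings the density back down.
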
This article, for
instance, has the following word frequency:
word: 9, frequency: 7, used: 6, not: 6, search: 6, text: 6, engines: 6,
analysis: 5, can: 5, use: 5 ...
I could have edited the text after the analysis to avoid
the intensive use of "word" and "frequency" on stylistic
grounds. However, that use is fine for Search Engine Optimization purposes (an
attempt to make this article easier to find on Google and Yahoo).
Are there any
serious writers who still avoid using a networked computer? Probably not many
can avoid using the Web and the search engines to find the correct word or the
most used expression, or to perform spelling or grammar checking. Checking word
usage in Google is faster and more efficient than using a dictionary, whether on
paper, on disc or on the Web. The search engines index words as they are actually
written on the Web, not only the well-written words that dictionaries list.
Be prepared to have
your texts analysed for word frequency, educational level, plagiarism,
technicality, jargon usage and other parameters, in addition to old-fashioned
spelling.
Given these trends, the ultimate challenge for a
job candidate would be to write an essay with pen and paper. Most of us are not
prepared to pass such a test.
I expect not to see synonymized versions
of this article...
About the Author:
John Tello works
for a company that makes the Synonymizer software, available at
http://www.Synonymizer.com, which is evolving into a more
complex machine-aided writing tool. He once passed the TOEFL (Test of English
as a Foreign Language), before the PC age.




