Who is this website for?
- Curious people who want to find words that have something special. For example: words ending in "u" or the longest words in English.
- English language learners who want to see how certain words are formed. We firmly believe that remembering a set of rules is pretty hard, but when you see those rules illustrated with plenty of examples they stick in your memory much better. For example: the plural of words ending in "o" or the list of words with irregular plurals.
- English teachers who need examples to teach the rules and, more importantly, their exceptions. For example: adverbs that do not end in "-ly".
- Teachers of phonetics who want to find examples of words that are pronounced in a certain way. For example: words in which the "h" is not pronounced or words with the triphthong "eia".
- Linguists and researchers who want to draw conclusions from real data and formulate new theses. For example: statistics on how many syllables words usually have or which vowels are most used in English.
- Software engineers developing new solutions based on Natural Language Processing who need raw data on the language. For example: a fully conjugated list of verbs or a full list of proverbs in English.
How was it made?
We managed to obtain a list of words from the free dictionary Wiktionary.org. The data there is barely structured and is entered manually by its users. The administrators make sure that they do it in an orderly way, but there isn't a unified format. That was a real challenge for us, because this site needs words to be highly organised so that we can create lists matching a given criterion.
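To give an idea of what "lists that match a certain criterion" means in practice, here is a minimal sketch. The function name `build_list` and the tiny word sample are hypothetical, made up for illustration; they are not our actual database or code.

```python
def build_list(words, criterion):
    """Return the words for which the criterion function is true."""
    return [w for w in words if criterion(w)]


# Hypothetical sample; a real word list would hold far more entries.
words = ["menu", "potato", "haiku", "cat", "emu"]

# Words ending in "u", kept in their original order.
ending_in_u = build_list(words, lambda w: w.endswith("u"))

# The same words ordered from longest to shortest.
longest_first = sorted(words, key=len, reverse=True)

print(ending_in_u)
print(longest_first[0])
```

Once the words are organised, each list is essentially one such criterion applied to the whole vocabulary.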
So, we downloaded a full dump of Wiktionary and parsed it into our internal database. It took two months of engineering work to get something decent. Parsing the phonetic transcriptions was particularly difficult, because they were written in different formats and sometimes the system did not pick the best transcription available.
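The sketch below illustrates the kind of parsing described above; it is not our actual pipeline. It assumes the dump is a MediaWiki-style XML export with `page`, `title` and `text` elements, and that pronunciations appear inside `{{IPA|...}}` templates written in at least two argument orders. The sample data and the helper names (`extract_ipa`, `iter_entries`) are hypothetical.

```python
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical miniature stand-in for a dump; a real dump is far larger.
SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>word</title>
    <revision><text>==English==
* {{IPA|en|/wɜːd/|/wɝd/}}
</text></revision>
  </page>
  <page>
    <title>cat</title>
    <revision><text>==English==
* {{IPA|/kæt/|lang=en}}
</text></revision>
  </page>
</mediawiki>"""

# Matches the argument list of one {{IPA|...}} template.
IPA_TEMPLATE = re.compile(r"\{\{IPA\|([^{}]*)\}\}")


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://...}page'."""
    return tag.rsplit("}", 1)[-1]


def extract_ipa(wikitext):
    """Collect every /slash-delimited/ argument from the IPA templates."""
    found = []
    for match in IPA_TEMPLATE.finditer(wikitext):
        for arg in match.group(1).split("|"):
            arg = arg.strip()
            if len(arg) > 2 and arg.startswith("/") and arg.endswith("/"):
                found.append(arg)
    return found


def iter_entries(stream):
    """Stream (title, transcriptions) pairs without loading the whole file."""
    title, text = None, ""
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, extract_ipa(text)
            title, text = None, ""
            elem.clear()  # free memory held by the finished page


entries = list(iter_entries(io.BytesIO(SAMPLE_DUMP.encode("utf-8"))))
print(entries)
```

Streaming with `iterparse` matters here because a full dump is too large to load comfortably in one go. Choosing the "best" transcription among several candidates is a separate problem that this sketch does not attempt.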
We enhanced the data from Wiktionary as much as possible, trying to do advanced things like counting syllables and determining how "frequent" each word is, so that the most common words can be presented before very rare ones.
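As a rough illustration of these two enhancements, here is a minimal sketch. The syllable counter is a naive spelling-based heuristic (it counts vowel groups and discounts a silent final "e"), not the method we actually use, and it will be wrong for many words. The frequency figures are invented for the example.

```python
import re

VOWEL_GROUP = re.compile(r"[aeiouy]+")


def estimate_syllables(word):
    """Naive estimate: one syllable per group of consecutive vowels."""
    w = word.lower()
    count = len(VOWEL_GROUP.findall(w))
    # Discount a silent final "e" ("made"), but keep "-le" ("table").
    if w.endswith("e") and not w.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def sort_by_frequency(words, frequency):
    """Most frequent words first; unknown words last, ties alphabetical."""
    return sorted(words, key=lambda w: (-frequency.get(w, 0), w))


# Hypothetical frequency counts, made up for this example.
frequency = {"the": 1000, "cat": 50}

print(estimate_syllables("dictionary"))
print(sort_by_frequency(["zyzzyva", "cat", "the"], frequency))
```

A counter based on phonetic transcriptions would be more reliable than one based on spelling, which is one reason good transcriptions matter so much.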
What do we offer?
We've created a large number of lists that we think might interest the linguistic community. Some of them were pretty straightforward and some took a lot of work. If you think a list is missing, you can request it. We will read your comments and consider whether it would be useful for the community and whether we can add it easily.
As you might have noticed, our lists typically show a maximum of 1,000 items, even though they can actually contain many more. For each list we set a maximum that we considered reasonable for most users. If you need the full list, you can purchase it easily and it will be sent to your email immediately.
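The capping described above can be sketched as follows. The constant and function names are hypothetical, and the real limit varies from list to list.

```python
MAX_PREVIEW_ITEMS = 1000  # the typical cap mentioned above


def preview(items, limit=MAX_PREVIEW_ITEMS):
    """Return the items to display plus the size of the full list."""
    return items[:limit], len(items)


# Hypothetical list of 2,500 entries standing in for a real word list.
shown, total = preview([f"word{i}" for i in range(2500)])
print(f"Showing {len(shown)} of {total} items")
```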
We also offer the possibility of downloading our whole database of words. It took a great deal of work to obtain, and we cannot give it away for free because anyone could copy it and resell it. Anyone interested in that database will find it pretty cheap, because obtaining it manually would cost much more.