Online Tool to Break up text into vocab (by freq)?

icebear   December 9th, 2011 5:05p.m.

Hi all, I was just thinking how nice it would be to have a tool (or tools) that would let me:

1) Copy and paste an article in
2) Return a list of the words in the text
3) Sort the words by frequency
3.a) Ideally, some way to compare against those words already added in Skritter

I recall @Mandarinboy mentioning working on something like this a month or so ago (Skritter integration of some sort), but until something that fancy is ready, I was wondering if anyone knows of any sites I could use for the first 3 objectives?

My goal is to feed in an article, decide how foreign/unknown the vocabulary is, then load (some) new/frequent words into Skritter for study before reading the text later that day or week. I find reading texts much more enjoyable after studying the vocabulary (rather than the other way around).

Cheers!

Netbrian   December 10th, 2011 3:35a.m.

Hi! For Japanese, I found a word frequency generator here -- http://forum.koohii.com/viewtopic.php?pid=132000#p132000 .

What I do is make a spreadsheet or Access database of my Skritter vocabulary, and use that to filter out what I already know from my frequency output. It actually works quite well!
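The filtering step described above can be sketched in a few lines of Python instead of a spreadsheet. This is only a rough illustration: it assumes the text has already been split into words, and the function name `unknown_by_frequency` and the sample vocabulary are made up for the example, not anything Skritter provides.

```python
from collections import Counter

def unknown_by_frequency(words, known):
    """Count only the words missing from the known vocabulary,
    and return them as (word, count) pairs, most frequent first."""
    counts = Counter(w for w in words if w not in known)
    return counts.most_common()

# Hypothetical data: a segmented text and an exported vocabulary list.
words = ["学习", "中文", "学习", "汉字"]
known = {"中文"}

for word, count in unknown_by_frequency(words, known):
    print(word, count)
```

Keeping the known vocabulary in a set makes each lookup cheap, so this should stay fast even for a long article and a large word list.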

icebear   December 10th, 2011 3:40a.m.

Thanks, but I'm looking for something for Chinese that works online or on Mac OS X...

Dennis   December 10th, 2011 5:15p.m.

Apache Lucene will break Chinese text into words. It is written in Java, so it should run on a Mac with little or no change. I've worked with it before. Objectives 1 and 2 would be easy.

Lucene is at http://lucene.apache.org/
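To give a feel for what "breaking Chinese text into words" and sorting by frequency involves, here is a minimal Python sketch. It is not Lucene and does not use Lucene's method: it swaps in plain forward maximum matching (greedy longest match) against a word list. The tiny `lexicon`, the `max_len` limit, and the function names are all assumptions for illustration; a real tool would need a full dictionary.

```python
from collections import Counter

def segment(text, lexicon, max_len=4):
    """Forward maximum matching: at each position take the longest
    lexicon word that fits, falling back to a single character."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words

def word_frequencies(text, lexicon):
    """Segment the text, drop punctuation and whitespace tokens,
    and return (word, count) pairs sorted by descending count."""
    tokens = [w for w in segment(text, lexicon) if w.isalnum()]
    return Counter(tokens).most_common()

# Hypothetical three-word lexicon, just for the example.
lexicon = {"我们", "学习", "中文"}
for word, count in word_frequencies("我们学习中文，我们喜欢中文。", lexicon):
    print(word, count)
```

Words missing from the lexicon fall apart into single characters (喜欢 becomes 喜 and 欢 here), which is why the quality of the word list matters so much with this approach.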
