Organising words by frequency (Mandarinboy?)

bennyboyk   June 20th, 2012 10:58p.m.

I've noticed a couple of lists created by Mandarinboy which focus on word frequency, one on movies, the other by newspapers.

I'd like to make a couple of lists myself based on a few books which I'm reading and was wondering how you did it. Did you use a special kind of software?

Also, I'd like to download the subtitles of a couple of 电视剧 that I'm watching right now for the same reason. Does anyone know any good websites where you can find the subtitles?

Hope you can help!
Thanks

Roland   June 20th, 2012 11:29p.m.

Have a look here for subtitles: http://www.shooter.cn/

I use it with a small extension in Word (written by myself) to get rid of the time stamps and extra lines. I'm only learning simplified, so I also use the word translation function to convert traditional to simplified.
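
For anyone without such a Word extension, the clean-up step can be sketched in a few lines of Python. This is only an illustration, not Roland's extension: it assumes the subtitles are in the common SRT layout (a cue number, a time-stamp line, then the dialogue), and the function name `strip_srt` is made up here. It does not do the traditional-to-simplified conversion.

```python
import re

# Matches an SRT time-stamp line such as "00:00:01,000 --> 00:00:03,500".
TIMESTAMP = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)

def strip_srt(text):
    """Return only the dialogue lines of an SRT subtitle file (hypothetical helper)."""
    dialogue = []
    for raw in text.lstrip("\ufeff").splitlines():
        line = raw.strip()
        if not line:
            continue  # blank separator between cues
        if line.isdigit():
            continue  # cue number (also drops dialogue that is only digits)
        if TIMESTAMP.match(line):
            continue  # time-stamp line
        dialogue.append(line)
    return dialogue

sample = (
    "1\n00:00:01,000 --> 00:00:03,500\n你好\n\n"
    "2\n00:00:04,000 --> 00:00:06,000\n我们走吧\n"
)
print("\n".join(strip_srt(sample)))
```

Subtitle files in other formats (for example ASS/SSA) would need a different pattern.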

Roland   June 21st, 2012 12:45a.m.

bennyboyk, I was just thinking about this. An alternative could be to use Byzanti's reader for such a Word file and then import the unknown words into Skritter: http://www.skritter.cn/forum/topic?id=178865689&comments=30
Unfortunately, I don't have a Mac (I'm still working on Windows), but I'm waiting for Byzanti's iOS version so I can use it on an iPad.
What would be really cool is if such an application could be integrated into Skritter. I hope that, with the new sample sentences, we will be able to have customized sample sentences, e.g. to import a sentence from the subtitles for an unknown word.

Byzanti   June 21st, 2012 5:05a.m.

I'll be adding frequency information into the next version. So you'll be able to paste in a text, and get the words you don't know in the text listed in order of frequency in a corpus (thanks to Scott for the corpus!) or by HSK.

There's also this which looks pretty useful, although I'm not sure if it contains frequency info other than HSK: http://www.zhtoolkit.com/posts/2011/09/new-software-chinese-word-extractor .

Roland, the iPad version isn't a definite yet. I need to do much more work on the OSX version first before I even look into it! Been busy with job interviews and assessment centres these few weeks though, so progress is a little slow at the moment...

范博涵   June 21st, 2012 5:11a.m.

Roland,

I have had a vision of a Windows 8 PC/tablet/phone app that would essentially do the same as Skritter and Pleco combined, plus offer reading (optional licensed graded readers) and listening (movie, TV series and talk radio clips at various levels) practice in an integrated approach. But I am too busy learning Chinese and it would probably take until the release of Windows 9 before I would have the time to develop something like this. By that time I expect that Skritter will have evolved considerably. :-)

Roland   June 21st, 2012 8:53a.m.

Byzanti, thanks, I know. I would be more interested in seeing the unknown words than the frequency list; pushing them into Skritter would save me a lot of time (I've learned a lot of words already, so a frequency list is really of less concern to me, but it is certainly more important to others).
范博涵, yes, I have the same problem: I want to spend my time on learning Chinese and not on other things. I hope that, with the iOS app, Skritter can now attract more users and so gain the economic base to develop further in such a direction. Such an integrated tool would be really great.

Mandarinboy   June 21st, 2012 9:48a.m.

There is plenty of free software on the net that does more or less that. I used my own software. I had cleaned it up, but I lost my PC, so now I have to get back to Sweden to get hold of a backup of the source code. What most of those programs do is basically segment the text to pick out the words. After that you can, e.g., compare the result to your Skritter list to get the words you do not know and how frequent they are in your text. Or, as I did, find texts with enough known words for my level, etc. I also use shooter for most of my subtitle needs, but for my scanning I used files I got from CCTV. I think it would be great if Skritter could provide something like this in the future.

Technically speaking it is very simple, but there is much that can be done to the logic of the segmenter. In most cases a segmenter only gives you the longest matching series of characters it finds in its dictionary. I have tried to add some more logic so that it also looks at grammar, the use of personal names, etc. For your needs, however, I think a basic segmenter will do the trick. I will be back in Sweden for a short visit shortly and will pick up the source code then. I will try to have it hosted on the net so you and others can download and use it if you like. It can be used for frequency, Skritter matching, web spider crawling, word popups, etc.
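
To give an idea of what a basic segmenter of this kind looks like, here is a minimal Python sketch of longest-match segmentation, followed by a frequency count of the words missing from a known-word list. It is not Mandarinboy's software: the function names, the tiny dictionary, and the `max_len` cutoff are all assumptions made for illustration, and it has none of the extra grammar or personal-name logic described above.

```python
from collections import Counter

def segment(text, dictionary, max_len=4):
    """Greedy forward longest-match segmentation (hypothetical helper)."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until the dictionary matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the loop can advance.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words

def unknown_by_frequency(text, dictionary, known):
    """Dictionary words in the text that are not yet known, most frequent first."""
    counts = Counter(
        w for w in segment(text, dictionary)
        if w in dictionary and w not in known
    )
    return counts.most_common()

def known_coverage(text, dictionary, known):
    """Fraction of the dictionary words in the text that are already known."""
    tokens = [w for w in segment(text, dictionary) if w in dictionary]
    if not tokens:
        return 0.0
    return sum(1 for w in tokens if w in known) / len(tokens)

# Toy data, made up for this example.
dictionary = {"我们", "喜欢", "学习", "中文", "我", "学"}
known = {"我们", "喜欢"}  # e.g. words already in your Skritter list
text = "我们喜欢学习中文,我们学习"

print(segment(text, dictionary))
print(unknown_by_frequency(text, dictionary, known))
print(known_coverage(text, dictionary, known))
```

With a real dictionary file and an exported word list in place of the toy sets, `unknown_by_frequency` gives the kind of list asked about at the top of the thread, and `known_coverage` is one rough way to check whether a text has enough known words for your level.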

bennyboyk   June 24th, 2012 1:26a.m.

Sounds rather technical to me. Do let us know when you plan to make a trip back to Sweden. I'd be interested in trying to make a few lists of my own! :-)
