Sample/Example sentences wiki style

雅各   April 29th, 2010 6:13p.m.

I know there are probably more than a couple of people who like to find sample sentences when they come across new words, yes? It is part of learning a new word...

I wonder if we could make the sentence feature wiki style: allow people to add and vote on example sentences (: I dare say we would end up with a lot of sample sentences fairly quickly.

lennier61   April 29th, 2010 8:16p.m.

nice idea, useful as well

nick   April 29th, 2010 8:25p.m.

Contribute them to the Tatoeba project. I will build in integration with them in the next couple months, so any sentences that are added there will also show up here (and probably in other projects, too). Right now, they have 7000+ Chinese sentences and tons more for other languages.


This project is really awesome and everyone should help it out! You could even practice your translation skills by picking Chinese sentences which don't have translations in your native language and adding some.

mike_thatguy   April 29th, 2010 11:58p.m.

Wow, what a cool project!

雅各   April 30th, 2010 12:39a.m.

Wow, interesting! There seems to be no way to just search for all sentences with a particular Chinese word; I end up with a whole pile of Japanese mixed in (:

I tried to download the source CSV file but I get an error :/

gregshap   April 30th, 2010 7:43a.m.

In the short term, what about adding Tatoeba to the magnifying glass lookup?
It looks like their URLs have a pretty standard GET format:


Where _CHARACTERS_ is whatever Chinese characters you want to look up.

Sysko   April 30th, 2010 9:45a.m.

(Spoiler: I'm part of the Tatoeba dev team.)
@xkfowboa: when searching, did you specify the "from" language as Chinese?

雅各   April 30th, 2010 11:37a.m.

I didn't want to specify a "from" language, as I would rather search all Chinese sentences, not just the Chinese sentences that are the "From" sentence, so to speak.

It would be neat if the CSV download link worked; then you could just Ctrl-F the text (:

jww1066   April 30th, 2010 11:46a.m.

@sysko: it's an awesome project, good luck! When will you support the "to" field in search queries?

Trang   April 30th, 2010 12:23p.m.

@xkfowboa, if you specify "from Chinese", it will do what you want. That is to say, it will return ALL the CHINESE (Mandarin) sentences containing the characters you searched for. There is no such thing as 'Chinese sentences that are the "From" sentence'.

Here's a rough idea of how the system works.
- We have a table that lists all the sentences (call it 'sentences').
- We have another table that says which sentence is a translation of which other sentence (call it 'links').
- When you search "from Chinese", the system will retrieve from the 'sentences' table all the Chinese sentences that match your search.
- Then it will use the 'links' table to retrieve the various translations that exist for each sentence matching your search.
- And if (in the future, when we integrate this feature back) you specify "to English", it will only keep the Chinese sentences that have a translation in English.
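
The steps above can be sketched in a few lines of Python. This is only an illustration of the two-table idea, not Tatoeba's actual code: the sample rows, the language codes ("cmn", "eng", ...) and the function name `search` are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical miniature of the two tables described above.
# 'sentences': id -> (language code, text); 'links': (sentence id, translation id).
sentences = {
    1: ("cmn", "你在做什么?"),
    2: ("eng", "What are you doing?"),
    3: ("jpn", "何をしているの?"),
    4: ("cmn", "什么时候?"),
    5: ("fra", "Quand ?"),
}
links = [(1, 2), (2, 1), (1, 3), (3, 1), (4, 5), (5, 4)]


def search(query, from_lang, to_lang=None):
    """Return {sentence id: [translation ids]} for matching sentences."""
    # Index the 'links' table by source sentence.
    translations = defaultdict(list)
    for src, dst in links:
        translations[src].append(dst)

    results = {}
    for sid, (lang, text) in sentences.items():
        # Keep only sentences in the "from" language that contain the query.
        if lang != from_lang or query not in text:
            continue
        # Look up every translation of that sentence through 'links'.
        linked = translations[sid]
        # Optional "to" filter: drop sentences with no translation in to_lang.
        if to_lang is not None and not any(
            sentences[t][0] == to_lang for t in linked
        ):
            continue
        results[sid] = linked
    return results


print(search("什么", "cmn"))
print(search("什么", "cmn", "eng"))
```

Note that the "from" language only selects which sentences are searched; a sentence is never stored as a "from" or "to" sentence, which is why both directions of each link appear in the sample 'links' list.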

I hope it was clear enough :)

Also, the download links are broken but we'll be updating the page tomorrow :) Just note that if you decide to download the file, you will get ALL the sentences we have, not just the Chinese ones.

Trang   April 30th, 2010 12:27p.m.

@jww1066, can't say for sure when the "to" field will be supported. Could be two weeks, one month or two months. But probably not later than that because I really want this feature back :P

雅各   May 2nd, 2010 7:40a.m.

I was excited until I discovered it doesn't support Traditional Chinese characters :(

jww1066   May 2nd, 2010 9:41a.m.

No traditional?!?!?!?! That seems like a huge oversight...

雅各   May 2nd, 2010 10:04a.m.

Well, to be precise, it seems not to distinguish between traditional and simplified, but a quick search shows:

% grep "什么" sentences.csv | wc
224 822 11819
% grep "什麽" sentences.csv | wc
0 0 0
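
The same kind of check can be run for several spellings at once. Below is a rough Python sketch that counts, for each variant, how many lines contain it. The tab-separated sample rows are made up for illustration (the real layout of sentences.csv is not shown here), and the list of variants is just a guess at spellings worth trying.

```python
def count_matching_lines(lines, variants):
    """Count, for each variant, how many lines contain it (like grep | wc -l)."""
    counts = {v: 0 for v in variants}
    for line in lines:
        for v in variants:
            if v in line:
                counts[v] += 1
    return counts


# A few variant spellings of "what"; which ones to include is an assumption.
VARIANTS = ["什么", "什麼", "什麽", "甚麼"]

# Hypothetical sample rows. With the real download, pass
# open("sentences.csv", encoding="utf-8") instead of this list.
sample = [
    "1\tcmn\t你在做什么?",
    "2\tcmn\t你說甚麼?",
    "3\tcmn\t这是什么?",
    "4\teng\tWhat is this?",
]

counts = count_matching_lines(sample, VARIANTS)
for variant, n in counts.items():
    print(variant, n)
```

A zero for one spelling does not by itself mean a script is unsupported; the text may simply use a different variant of the same word.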

Sysko   May 2nd, 2010 11:54a.m.

@xkfowboa: in fact, in traditional Chinese it is 甚麼, so that explains why ;-)

We support traditional Chinese, but the detection is done on the fly on the website:
http://tatoeba.org/eng/sentences/show/342709 has a 漢 (traditional)
http://tatoeba.org/eng/sentences/show/342843 has a 汉 (simplified)
In grey you have the equivalent in the other script.

The script is not put in the CSV, because the CSV also contains sentences in other languages, so we don't have a field to specify whether a sentence is traditional or not, as it's not relevant for the other languages.
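
For anyone who wants to do a similar on-the-fly check on the downloaded file, here is a very rough Python sketch. It is not how the website does it: the handful of character pairs is hand-picked for the example, and a real detector would need a full conversion table. The function names `detect_script` and `other_script` are made up.

```python
# Tiny hand-picked simplified -> traditional pairs (an assumption for this
# sketch; a real detector needs a complete conversion table).
SIMPLIFIED_TO_TRADITIONAL = {"汉": "漢", "语": "語", "门": "門", "马": "馬"}
TRADITIONAL_TO_SIMPLIFIED = {t: s for s, t in SIMPLIFIED_TO_TRADITIONAL.items()}


def detect_script(text):
    """Guess the script from characters that exist in only one of the two."""
    has_simplified = any(ch in SIMPLIFIED_TO_TRADITIONAL for ch in text)
    has_traditional = any(ch in TRADITIONAL_TO_SIMPLIFIED for ch in text)
    if has_simplified and not has_traditional:
        return "simplified"
    if has_traditional and not has_simplified:
        return "traditional"
    # No distinguishing characters, or a mix of both.
    return "unknown"


def other_script(text):
    """Swap every known character for its counterpart in the other script."""
    table = {**SIMPLIFIED_TO_TRADITIONAL, **TRADITIONAL_TO_SIMPLIFIED}
    return "".join(table.get(ch, ch) for ch in text)


print(detect_script("我学汉语"), other_script("汉语"))
print(detect_script("漢字"), other_script("漢字"))
```

Many sentences contain no distinguishing characters at all, so "unknown" is a common and legitimate answer with a table this small.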

雅各   May 2nd, 2010 5:36p.m.

Interesting, I have never seen 甚麼 written anywhere!

Sysko   May 2nd, 2010 5:41p.m.

My dictionary lists it as a variant of 什麼.
Searching Google for 甚麼 gives a lot of results, but they seem to come mostly from content related to Hong Kong, and it's true that our traditional Chinese contributors are from that city.

雅各   May 2nd, 2010 11:54p.m.

Aah, that explains it: I spend most of my time reading Taiwanese websites, not Hong Kong websites. The Hong Kong based websites are not that helpful because often the Chinese is actually Cantonese, not Mandarin.

In my limited experience, Cantonese speakers tend to phrase things a bit differently and use different Chinese characters/words than what is commonly used in Taiwan and mainland China.

pts   May 3rd, 2010 8:46a.m.

麽 is the character chosen by the Mainland to be the traditional form of 么. The correct way to write it in Taiwan is 麼.

The last sentence of the entry for 麼 in the Kangxi Zidian (康熙字典) reads: “下从幺。俗作么,誤。” Literally, this means, “The lower component is 幺. Commonly written as 么, wrong.” Moreover, the character 麽 is not listed in the Kangxi Zidian.

