For any digital crowdsourced dictionary to be useful and viable, step one is to have an open license to prevent contributors from being turned off by the thought of contributing their efforts to what might eventually just end up as commercial as Urban Dictionary. This project makes no mention of licensing at all, though. That shouldn't be an afterthought.
>For any digital crowdsourced dictionary to be useful and viable
I'm pro open licenses, but you're overstating things. The sheer usefulness and viability of Urban Dictionary attests to the fact that most people don't actually care about licenses.
> The sheer usefulness and viability of Urban Dictionary attests to the fact that most people don't actually care about licenses.
Urban Dictionary is indeed extremely useful, but I think making this out to be evidence that people don't care is oversimplifying things. As useful as it is, it's still pretty clear that the licensing (along with a few other things it lacks, such as open moderation) severely holds it back from being as comprehensive & accurate a source as it could be. It achieves usefulness through sheer volume: the vast majority of entries on it are nonsense, there are just so many that there are diamonds to be found in the rough.
Tossing messages into a conversation is not the same as contributing to and curating a linguistic database, both in terms of cognitive load and in terms of how easily contributors can be lucratively despoiled.
I don't think folks should rush to infer stuff from this. There's a kind of dogma about open licenses which often undermines the very efforts proponents say it supports.
Another way to say that is it might be unwise to let ideological peer pressure coerce you into an open license, under the guise of serving some imagined superset of users who "really want" and "otherwise can't use" that, while ignoring the actual market segment who would happily use your product commercially, and the benefits that such restrictive protection provides.
Open source can be abused to coerce newcomers to surrender any competitive advantage they might compound by using such "imaginary market" and "ideological purity" arguments, when in reality it may just be a tool of incumbents to weaken any competition, by eroding a startup's ability to capture value.
Or in a pithy form: "Go open source! It's bad if you don't" said the Big Bad Wolf Corp to the Little Piggy Startup. "Oh yes, Mr Wolf, anything to please!" said the Little Piggy Startup, trusting the Big Wolf, because, who would want to hurt Little Piggy?
Be cautious before you change anything and consider many options! If your instinct is to stick to stronger protections, then it might be wise to do that!
I was already looking for a bug tracker to fix the swallowed newlines and blank lines after submitting... I'm so used to open source now, it somehow surprises me not to be able to find at least a public bug tracker even if I can't directly submit a simple fix.
One can share "feedback" via Google Docs, though :') yeah, no thanks
Now that you point out the data license, yeah, this project will definitely be dead within two years. No sustainable license and not interesting enough to revisit daily like Mastodon, so it will neither build a community nor a sustainable dataset. It's simple enough that part time support will let it survive in the background for a while, but colour me surprised if the author hasn't fully moved on come 2026
I would be more concerned about accuracy, bias and curation sooner than I would think about the license in case of a community-built dataset. Take a website such as TasteAtlas, for example. They frequently publish rankings of best/worst foodstuffs, but their data on that is garbage (even if you skip the fact that taste is highly subjective). Not only are there lots of incorrect assumptions made (e.g. a dish always belongs to only one national cuisine), but also the popularity data is not curated at all, so they end up with situations where the first place in my country on the "100 Best Food Destinations" list went to a tiny village with a truck stop.
I think that's what the parent is saying. Are the site's users unpaid labor? That was a thing corporations tried to do back when Urban Dictionary was starting, but in today's climate working for a corporation for free doesn't have the cachet it used to.
You are absolutely right! I've been making updates to the homepage and this was an oversight, I will fix it as soon as possible. Thanks for pointing it out!
For what it is worth, you can visit the explore page through the menu or at untranslatable.co/explore
There are some quite obvious errors, but the creator writes she has "learned to program from scratch in order to create this website". I think it's a good start.
I have been having some javascript issues that I need to work on (I am currently traveling), but you can find slang by language by simply typing the language or country in the search bar, or when you have found an entry by clicking on the language or country you are interested in! Hope this helps, and I'll try to fix these issues asap.
Very interesting they were funded partially from Kickstarter! 292 backers at 10k€. I assumed you needed quite the following for Kickstarter to work...
And it looks like they do. 49k followers on Facebook and 16k on Instagram. Not sure how far back these go, but looks like very "shareable" content, where they would take untranslatable words and make little funny pictures or memes or other intriguing things and post them. Lots of interaction comments/reaction-wise
Timeline-wise this was backed on Kickstarter in 2020. Site launched in summer 2020. The creator was very active on Kickstarter working on communicating and updating the community with what was going on (until the end there).
Sounds more like the community of people in this interest group/hobby had an interest and she was just stewarding their desires. And her and the community's values ended up aligning. Like a contractor/contractee relationship
My first sentence was legitimate, she’s convinced people to invest in her idea - as you say - and I’m impressed. Perhaps we disagree on where the investment has gone, but hey ho. If you think the site as it stands represents €10k of value then we disagree on that too.
The second sentence was somewhat tongue in cheek (and I’m not a fan of KS in general, been bitten too many times). Well done on jumping straight to name calling/insults though, great way to hold yourself during public discourse.
Thank you so much for this overview - I had no idea I was featured on awesome-linguistics!
I actually had a bit of a following on some language-themed meme pages and used that to launch this project on Kickstarter. The Kickstarter was partially to gain some funds (although after rewards, shipping, and taxes it really wasn't a great amount) and partially to see if people were interested in the idea at all, and it turns out they were.
So this is international, but is the aim for it to be used by an international audience? e.g. if a Brazilian were lamenting the fact that UrbanDictionary is in English, is this site for them?
Because right now when I explore the entries, all the definitions are in English. Is this the intent? So it's for English people to find out what international slang words are?
I feel like it would be good to be really clear about your audience.
Maybe one option would be to allow the writing of definitions in multiple languages. Then a user could look up a word, and see all the definitions and find the definition in their language.
The fact that it’s in English doesn’t preclude its use by the non-English.
For example, I (a Dutch person) just found out about some Brazilian and Panamanian slang. Presumably, a Brazilian or Panamanian person could also find out about Dutch expressions (I saw some Dutch entries).
So the target audience, to me, is quite clear: people from around the world, who are interested in other languages and cultures.
It is indeed for an international audience - language learners or people who are generally interested in learning about other languages and cultures.
There are various slang dictionaries for specific languages or countries, but as far as I could tell there were none taking slang from different languages and explaining it to foreigners.
Adding definitions in different languages would make the moderation process quite a bit more complicated, and it would also defeat the purpose of explaining slang to an external audience.
Cool but why does clicking both “see what people have added” and “add an entry” link me to the same page to add an entry? Is there no way to view the collection without adding an entry?
During my last trip to Fujian China, I became fascinated by the Mandarin word that sounds like, "schma." My wife was born there, and she had told me it meant, "what." But this time, I was really listening to how the word was being used, and it occurred to me, it was being used somewhat like "donc" in French. (Which my French teacher in high school told me was untranslatable.)
When I realized that, my reaction was, "What!?"
After our trip, I became obsessed with the word for receipt: "fapiao." I thought my inability to tell the difference in the tones was hilarious, but now she refuses to say the word in front of me at all! This is also hard for me to understand. As far as I'm concerned, she's pretty much just saying, "ma" to me 4 times in a row! There's a difference, but it's very difficult! Why is my befuddlement and amusement at that so annoying? There's this Taiwanese comedienne who was talking about being annoyed at her western boyfriend not being able to tell, so apparently that's a thing.
This is curious to me. As a Westerner and speaker of a non-tonal language that started learning Chinese already being an adult, I never had any difficulty telling the difference between the tones. I did have difficulty with other sounds of the language (e.g. telling "x" from "sh" or "q" from "ch") - hell, I even have difficulty with English sounds, for example, to this day I need to pay an inordinate amount of attention to tell "eyes" from "ice" and there's no way I'll successfully pronounce them differently in regular real-time conversation. But the Mandarin tones? I find the difference obvious.
I wonder if being a music aficionado, having played an instrument, etc. helps with the tones.
> for example, to this day I need to pay an inordinate amount of attention to tell "eyes" from "ice"
I just find that obvious. Wherever I've been in the US, "eyes" is drawn out, but "ice" is short. (Think of imitating a southern accent for "eyes.") The only people I've ever heard saying "eyes" so it sounds like "ice" are native German speakers and other Europeans. (My wife learned German for her Chinese/German comparative lit degree, so the way she says certain things in English sounds menacing to me like how a WWII German movie character says, "We have ways of making you talk!" In particular, when she says, "Your handwriting...looks like WORMS!")
> I wonder if being a music aficionado, having played an instrument, etc. helps with the tones.
I've been playing traditional music for almost 40 years, and I even qualified to compete at what's basically the world competition once. I think it matters most what one got used to as a child.
> The only people I've ever heard saying "eyes" so it sounds like "ice" are native German speakers and other Europeans
In German, many syllable-final consonants (in particular, "s") are always voiceless, but the English plural "-s" is voicet (which strangely enough none of my English teachers ever botheret to mention) so if you apply German phonological rules to "eyes" you get something that sounts identical to "ice". But Germans hafe no problem pronouncing "eyes" if the "s" is not syllable-final, e.g. in "Eisen" ("eyesn", iron).
To learn to speak a languache like a natife, you neet to break habits of thought you didn't even know you hat.
Isn’t the point of difference that "eyes" has a little slide in the first vowel, ‘ah-I ss’, and maybe a more voiced, z-like s?
And "ice" just has one extended ‘ah- s’ and a more interdental, fricative s.
No, you're not alone. Approaching Mandarin I expected tones to be some sort of big hurdle. They're not. They're largely obvious. It's a part of the vowel, basically. Linguistically speaking, there's very little distinction between a vowel and a tone - it's part of how you make the vowel. And tone and vowel quality interact in a complex way, which means you're hearing changes in the vowel, along with the pitches involved. As you mentioned, at the start I was more likely to mix up x and sh than I am to mishear a tone.
Other languages use tone, of course, they just don't use it lexically to distinguish words. I also have played an instrument, etc. but I don't know if that's a factor or not.
Now, pronouncing the tones is a whole other question. My own Mandarin has like 2.5 tones instead of 4, and I struggle to apply tone contours to long phrases without messing up everything involved. Both English and Mandarin have tone contours (and a lot of them are even the same, for example, slowly rising with a sharp rise over the last few syllables = question) but the tone contours of Mandarin interact with the lexical tones of a word. Something we don't have to worry about in English. I doubt I'll ever get enough practice to make that automatic.
> Approaching Mandarin I expected tones to be some sort of big hurdle. They're not. They're largely obvious. It's a part of the vowel, basically. Linguistically speaking, there's very little distinction between a vowel and a tone - it's part of how you make the vowel.
Both a Korean teacher of mine and an old housemate (who was a native Russian speaker and had a degree in French) pointed out to me that Americans are "lazy" (that is the technical term, I gathered) about how they use vowels. We get diphthongs confused with pure vowels. Unless it's pointed out to us, we don't think of how we say "oh" as containing an element of "w" at the end.
> And tone and vowel quality interact in a complex way, which means you're hearing changes in the vowel, along with the pitches involved.
Ah ha! I think you just helped me! I hadn't been thinking of these two together!
> the tone contours of Mandarin interact with the lexical tones of a word. Something we don't have to worry about in English.
Tone of voice is diabolically subtle, the way British and American speakers use it. About half the time, we're using it to indicate the opposite or almost opposite meanings of words. My wife from Fujian doesn't think of speech in quite the same way. We got into an argument, because she kept shouting, "BE CAREFUL!" every time someone cut me off in downtown SF traffic. It took me awhile to understand that she was just frightened and was telling me to be careful. ("HOW COULD YOU TWIST SUCH A TENDER EXPRESSION OF CARE!?" -- Which she said in that tone of voice.)
Tones don't occupy the same part of my brain as parts of vowels. It's more like a musical soundtrack accompanying the dialog.
My Japanese teacher said that, for her, the English words “ear” and “year” are indistinguishable.
I see how that could be, the words are very close, the ‘y’ sound is very brief, but it helped me understand how something that is so clear for a native speaker could be very difficult for a foreigner to hear.
> During my last trip to Fujian China, I became fascinated by the Mandarin word that sounds like, "schma." My wife was born there, and she had told me it meant, "what."
That would be 什麼 shén me, pronounced exactly as you described
Do you have trouble understanding intonated sentences in English too? You can change the meaning of this question pretty significantly by placing a rising intonation on each word:
You stole his pen?
_You_ stole his pen? -> It was you who stole his pen?
You _stole_ his pen? -> You in fact stole his pen, it was not given willingly?
You stole _his_ pen? -> You stole his pen, and not someone else's?
You stole his _pen_? -> You stole his pen, and not something else of his?
> Do you have trouble understanding intonated sentences in English too? You can change the meaning of this question pretty significantly by placing a rising intonation on each word
Read my other comments. You'll find I'm already talking about this!
Ok, so it's totally not a rising intonation to me! Wherever you have the _x_, it's emphasis. "Rising" is about the most confusing way to describe it, from my point of view as a layperson.
Also, as I point out elsewhere, the way you are talking about "intonated" sentences is more akin to an accompanying soundtrack, where there's a "stinger" played at the moment something significant is said. It's not a part of the vowel!
EDIT: Okay, I've been working this out saying it to myself. There's nothing "rising" in "you" when I say "_You_ stole the pen?" The volume doesn't rise through the word. The pitch doesn't rise through the word. The only thing that rises in the whole sentence is the pitch approaching the last part of the sentence to indicate a question.
If "rising" is the actual terminology, then that's the most misleading terminology I can think of!
Maybe "rising" was not 100% correct. You're right in that you can place emphasis without the pitch of the word actually going up (instead you can make the pitch go down, or flat even, and say it with a bit more force & space afterwards). Playing with pitch is just the most obvious example of intonation I can think of in English.
I'll admit I also have musical training, so it makes sense _to me_ to think of pitch in these terms. Maybe it doesn't to most people.
Like saying "therefore" might mean "I accept the premise".
I basically never hear that as a standalone reply in English!
As an interjection or at the beginning of a sentence, "donc", like "so", can mean "now then", "that said", "moving on" or "without further ado".
I'm not sure, but I think sometimes when my wife is saying "schma," it's somewhat like the "donc" you describe above. But sometimes, I think she is expressing agreement with a person that a third party was a bit mistaken or off the mark, so it's expressing a kind of disagreement. Not sure.
Cool idea, but definitely needs some more content - the "explore" page shows 103 pages with 20 entries each = ~2060 entries, which is a good start, but not really comprehensive I would say. Seems to have had a flurry of activity in 2020, then nothing, then a single new expression added this month?
This is because I don't think the same thing being added multiple times is a huge issue. The same word can have multiple meanings, slightly altering meanings, or some people are more extensive in their description than others. I think that is a feature of a crowd-sourced dictionary - you get to see how people describe it instead of one curated description. At the end of the day, it's up to the user to see which description they find most compelling, helpful or useful.
You are right, the website needs way more entries. I work on this project while also working full-time, but one of my biggest wishes has been to invest more time into getting people to add more entries.
That being said - entries are not organized by the latest addition, so there are more entries from the past few years.
Hi! I am Amarens, the creator of Untranslatable and I just wanted to thank you for sharing my project on here! It has given me a big spike in visitors and new contributions, and I just think it's very cool you wanted to share this project with other people. THANK YOU!
I just searched for non-English slang on UD and found two definitions instantly. I think it's safe to say that the internationalism is at least not prohibited.
Urban language is vulgar. Seeing “inclusivity” as a value gives me pause about the usefulness of this endeavor.
As I’ve gotten older (many years past high school now), I care less about finding and sharing vulgar terms with friends. But, when I come across language on social media that I don’t understand, I want a guide. Fail me that, and I won’t return.
Pull the ten most common (but missing) definitions from Urban Dictionary daily and provide a clean room definition. That would serve as a good litmus for the sorts of censorship we could expect.
"Inclusivity" doesn't mean "we don't define swear words". In fact it's specifically explained here as meaning "allows entries in any language, from any dialect".
Also, I bet running urbandictionary.com is an absolute moderation nightmare. Imagine that, and now it's in thousands of languages. I wish the authors the best of luck.
I was daydreaming (just a really dumb daydream) about a 'universal language' that would include all the words from every known language, just a few minutes ago before I discovered this on HN. haha.
I think it's really neat since there are thousands of foreign words that cannot be translated into English or Korean (the languages I speak), and it takes quite a bit of time just to really understand what those words mean.
i particularly like that hacking the URL query parameters is apparently the only option for navigating the country and language categories, but those query parameters are at the end of the URL, usually past the edge of the URL bar field
they're past the end because the first parameter is a giant "authenticity token" base64 blob. you'd think this is maybe important, but removing it doesn't appear to affect the request at all
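for anyone who wants to try the same trick, here is a minimal sketch of stripping the token so the category parameters end up in view. the host, the parameter names (`authenticity_token`, `language`, `country`) and the values are all made up for illustration, and whether the site keeps ignoring a missing token is only what i observed, not something i can promise:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def drop_param(url: str, name: str) -> str:
    """Return `url` with every query parameter called `name` removed."""
    parts = urlsplit(url)
    # Keep all query pairs except the one we want to strip, preserving order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != name
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical URL: host, parameter names and values are assumptions.
example = (
    "https://example.com/explore"
    "?authenticity_token=AAAA%2BBBB%3D%3D&language=Dutch&country=Netherlands"
)
short = drop_param(example, "authenticity_token")
print(short)
```

with the blob gone, the remaining parameters sit right after the path, so they are easy to edit by hand.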
This is pretty neat. I expected it to immediately have descended into the same hell as Urban Dictionary, but it actually seems useful. I appreciate that it tells me who uses a term and where, and I especially like being able to click on those labels to explore more terms in the same category—I wish there were a way to see all those categories and choose between them (if there is, it isn't obvious to me).
The Italian ones are real, but a bit less 'urban dictionary' than phrases that have been around for a while. Probably just needs more contributors though.
When browsing my native language, I found that many of the words were in fact not untranslatable or urban in any way, but rather common words like "computer" that have been formed from roots different from those in English.
So Urban Dictionary clone, but English seems to be the second language in this case.
Is licensing the draw or the differentiator? I don't think the message on that is clear.
saw the expression "que sopa?" and thought about all the verlan french rappers brought into the french language. most of them will never make it into the french dictionary.