shever73 4 days ago

A nice synchronicity here: I was only checking Māori words today because The Guardian's cryptic crossword was set by "Pangakupu" (which means, logically enough, "crossword"). This crossword setter always includes a hidden Māori word or phrase in the puzzle.

mydriasis 4 days ago

I see you've posted about Maori stuff a couple of times. I want to congratulate you, this is really, really great. Thank you for working to preserve a language and culture! You're presenting resources that are tough to find, and that's an amazing thing.

hk__2 4 days ago

I get "Sorry, something went wrong. If this error persists, contact us." every time I type something.

  • firstbabylonian 3 days ago

    Thanks — there was a cookie-related bug, which should now be resolved.

ks2048 4 days ago

I can't type anything in the text area on Firefox. Works in Chrome (macOS).

  • XeO3 3 days ago

    Yeah, that element should be 'textarea' instead of 'div' or at least the 'contenteditable' should be true.

  • joemi 4 days ago

    Also doesn't work in Firefox on Windows but does in Chrome.

timonoko 2 days ago

Is it true that Maaori is crapped by Ænglish spelling? In all other languages a long vowel is just two vowels, not some stupid umlaut on top.

  • TRiG_Ireland 16 hours ago

    It's certainly not an umlaut. Nor yet is it a trema, which is what you probably mean. It's a macron, which is commonly used to mark long vowels.
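The two spellings in this subthread (macron versus doubled vowel) map onto each other mechanically. A minimal sketch, assuming Python's standard `unicodedata` module; the function name `macrons_to_double_vowels` is made up for illustration:

```python
import unicodedata

COMBINING_MACRON = "\u0304"  # the mark that sits on top of ā, ē, ī, ō, ū


def macrons_to_double_vowels(text: str) -> str:
    """Rewrite macron-marked letters as doubled letters (ā -> aa)."""
    out = []
    # NFD splits a precomposed letter such as ā into "a" + U+0304.
    for ch in unicodedata.normalize("NFD", text):
        if ch == COMBINING_MACRON and out:
            out.append(out[-1])  # repeat the base letter instead of the mark
        else:
            out.append(ch)
    # Recompose any other accents that were split apart by NFD.
    return unicodedata.normalize("NFC", "".join(out))


print(macrons_to_double_vowels("Māori"))  # Maaori
print(unicodedata.name("ā"))              # LATIN SMALL LETTER A WITH MACRON
```

Going the other way (doubled vowel to macron) is not safe to automate like this, since two adjacent identical vowels are not always one long vowel.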

    • timonoko an hour ago

      Sort-of. Because in the Anglo world "aa" is "ä". Even ChatGPT thinks that it is OK to use "AA" when making a Finnish Morse generator.

      In hindsight Maaori is not so bad. Some American Indian writing systems are just pronunciation guides for Anglos (or French). I tried to study Haida some 30 years ago, but it was too complex and miserable, because there were no actual audio clips available at that time.

  • timonoko 2 days ago

    Yes, says ChatGPT:

    The Māori word "Māori" can be transcribed into the International Phonetic Alphabet (IPA) as:

    /ˈmaːɔɾi/

    Here’s a breakdown:

      /ˈ/ – indicates primary stress on the first syllable
      /m/ – a voiced bilabial nasal, like the "m" in "man"
      /aː/ – a long open front unrounded vowel, similar to the "a" in "father," but held longer (the macron indicates length)
      /ɔ/ – a mid-open back rounded vowel, like the "o" in "thought"
      /ɾ/ – a tapped or flapped "r," similar to the quick "r" sound in Spanish "pero"
      /i/ – a close front unrounded vowel, like the "ee" in "see"
    This transcription represents the most common pronunciation of the word "Māori."

stephantul 4 days ago

This is very nice and important. We need more tools for small languages.

neallindsay 4 days ago

excellent use of a Punycode domain

  • yardstick 4 days ago

    I was going to disagree with you, because most kiwis have no idea how to write the special o (myself included), so they’d end up typing toreo.nz instead.

    Which as it turns out, redirects to xn--treo-l3a.nz anyway.

    Nice!
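For anyone curious how the two forms of the domain relate, here is a small sketch using only Python's built-in `idna` and `punycode` codecs. The Unicode spelling `tōreo.nz` is inferred from the thread (an o with a macron), not quoted from the site itself:

```python
# Unicode hostname -> the ASCII form that actually goes over DNS.
ascii_host = "tōreo.nz".encode("idna").decode("ascii")
print(ascii_host)  # xn--treo-l3a.nz

# ASCII form -> Unicode, which is what a browser may choose to display.
unicode_host = b"xn--treo-l3a.nz".decode("idna")
print(unicode_host)  # tōreo.nz

# The raw Punycode of a single label, without the "xn--" prefix.
label = "tōreo".encode("punycode")
print(label)  # b'treo-l3a'
```

The built-in `idna` codec follows the older IDNA 2003 rules, which is fine for this label but is not a substitute for a full IDNA 2008 library.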

    • lostlogin 4 days ago

      > kiwis have no idea how to write the special o (myself included)

      I’m in New Zealand too. I work in MRI and have to type ‘TE’ (echo time) regularly, as well as the Māori word ‘te’.

      Whatever secret sauce Apple sprinkles into iOS is actually malignant and it takes about 3 edits to type te/TE whenever I try.

      • nicoburns 3 days ago

        Yeah, Apple's autocorrect implementation is shockingly bad. Android is much better in this regard.

        • williamdclt 3 days ago

          I’m 2y into having an iPhone, generally liking it better but autocorrect alone keeps me on the edge of switching back to android, that’s how bad it is. It’s not even about comparing to android, it’s just that iOS is bad. If I had a bigger phone I’d turn it off entirely but on a mini it’s juuuust useful enough that it makes me want to throw my phone against the wall _less_ than if it was off.

          I honestly want to have a coffee chat with the PM in charge of autocorrect at apple, I need to understand what the hell they are thinking!

          • lostlogin 3 days ago

            I type what I want then add a sacrificial letter at the end, then delete it, then carry on.

            - I just tested and this is the way, but it still took me a couple of tries due to it screwing with the capitalisation.

    • mkl 4 days ago

      The ō is an o with a macron. It's pretty easy to install a keyboard layout that supports it: https://kupu.maori.nz/about/macrons-keyboard-setup. Many mobile keyboards support it by default with long presses to pop up an accent/variant chooser.

      • lmm 4 days ago

        > It's pretty easy to install a keyboard layout that supports it

        Only if you don't need anything else from your keyboard layout. I use Dvorak and need to type Japanese, and I think either of those makes it impossible to enter macrons on Windows.

        • mkl 4 days ago

          You can have multiple keyboard layouts installed, and it takes a fraction of a second to switch with a shortcut key (on Windows it's Win-space). I have a Māori keyboard layout installed on my work computer, but I only switch into it to type words with macrons, then switch back (I use ` more often and don't like having to double-press it).

    • EdwardDiego 4 days ago

      I'm a fan of macOS for making it real easy to type vowels with umlauts / macrons etc.

      • samatman 4 days ago

        Unfortunately the macron is the one missing dead-key accent on the US "ABC" layout. It's easy enough to hit the globe key when this comes up, but it annoys me a bit that Opt-y is ¥, and Shift-Opt-Y is Á, which is a duplicate: Opt-e-A will also produce it. I'd be happier if Opt-y was the macron dead key and Shift-Opt-Y took over for ¥: I can go a year without needing the Yen symbol, but it makes sense to have it. I don't think the English layout needs two ways to type Á though, it's excessive.

        • EdwardDiego 4 days ago

          I just hold down the vowel, then hit 9 for the macron.

        • _zoltan_ 4 days ago

          a proper ő is also missing unfortunately

scanny 4 days ago

Awesome work, love to see the effort on the technical front of bringing a language into broader use!

pabs3 3 days ago

Is the source code of this somewhere?

addaon 4 days ago

Slightly off-topic, but it would be nice if HN interpreted punycode in link descriptions. Especially given that the links go through a redirect, which means that the browser status bar sees them as part of the query and not the domain, so the browser's own interpretation of punycode never gets applied.

  • zahlman 4 days ago

    Seeing the Punycode link is actually a security feature, because it means you aren't tricked into visiting, say, pple-06g.com (apple with a Cyrillic a).

    • smallerize 4 days ago

      There are conventions around that. https://chromium.googlesource.com/chromium/src/+/main/docs/i... Generally, if all the characters are from one script, then it is decoded. There are lots of exceptions detailed there, but it's harder to make a homoglyph attack work using only characters from one script to impersonate another.
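The "all characters from one script" idea can be roughed out in a few lines. This is only a crude sketch and is not Chromium's algorithm: it guesses the script from the first word of each character's Unicode name, whereas real implementations use the Unicode Script property plus many exceptions. The helper names `scripts_in` and `looks_single_script` are hypothetical:

```python
import unicodedata


def scripts_in(label: str) -> set[str]:
    """Crude script guess: the first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def looks_single_script(label: str) -> bool:
    """True when every letter in the label appears to share one script."""
    return len(scripts_in(label)) <= 1


print(scripts_in("tōreo"))                # {'LATIN'}
print(looks_single_script("\u0430pple"))  # False: Cyrillic а plus Latin "pple"
```

A label written entirely in one script that happens to resemble a Latin word would still pass this check, which is why the browser rules carry so many extra steps.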

      • dmurray 3 days ago

        That's not a convention, it's a specification for how Google Chrome does it.

        And it's not even a full specification. Several of its 13 steps link to other documents that need to be read to implement the spec fully. Step 12 refers to a list of "dangerous patterns" which appears only to exist in the Chromium source. Step 5 refers vaguely to "any characters used in an unusual way".

        It's not OK to say that because Chromium does it, it's some internet standard that random website maintainers should implement.

        • smallerize 3 days ago

          I think you're ignoring the conversation. There is a lot of discussion to be had, and we don't have to say that decoding punycode is a security risk and simply do without. I also said "conventions" specifically to avoid meaning that these are hard-and-fast rules. And Firefox does something pretty similar. https://wiki.mozilla.org/IDN_Display_Algorithm#Algorithm

  • samatman 4 days ago

    Someone always says this when a punycode link shows up.

    I'm glad they don't. What you see? That's the link. It's what the browser sends, it's what DNS resolves: it's the link. Displaying it as Unicode is just a display option, and it's one which opens up all manner of mischief through confusables.

    It's a hacker culture choice, and it's one I appreciate.

    • TRiG_Ireland 4 days ago

      On the other hand, that's a rather anglo-centric viewpoint.

      • samatman 4 days ago

        It is! So kind of you to notice. Perhaps you could also notice that English is the language used on Hacker News.

        I'm quite sure a website centered in a different cultural landscape might choose a different convention. Good for them, I say.

        If URLs start being Unicode, and not an ASCII encoding which is sometimes displayed as Unicode, that would be a different story. But that's not how things are.

  • lpapez 4 days ago

    You can easily write a Tampermonkey userscript for that. As HN doesn't update its CSS that often, it should be quite a low-maintenance solution.