On-device browser translations with Firefox Translations (ctrl.blog)
312 points by zaik on July 10, 2022 | 84 comments


I believe this is called the Bergamot project, more can be found here: https://browser.mt/

The GitHub repo for it is here: https://github.com/browsermt/bergamot-translator

The repo contains some details about how to run it in WASM which is quite interesting for embedding it in pages. I've been playing around with using WASM to capture speech to text (https://github.com/ccoreilly/vosk-browser) and automatically translating it using Bergamot.

Results have been OK. I don't think the tech is quite there yet, and the speech to text obviously struggles with multiple speakers.


I believe there is also a standalone online demo of the translation engine here: https://mozilla.github.io/translate/


Why use a browser at all when the same funded project has a separate native app for exactly this? [0]

[0] https://translatelocally.com


Because if I am viewing a site that I want translated I don't want to copy it into a different app and lose all of the formatting.


that is super cool


This seems to be almost completely funded by EU Horizon 2020 Research & Innovation programme[0].

[0]: https://cordis.europa.eu/project/id/825303


Whoa - an EU Horizon thing that actually produced something kinda useful to citizens rather than just being a vehicle to shovel taxpayers' money into whoever can produce the biggest pile of paperwork? Wow!


Horizon 2020 has funded a ton of projects, you might want to do some reading before you get deeper into your unsubstantiated opinions.

Here is a list of ~17000 projects funded by Horizon 2020 (H2020 + H2020-EU.5 programme, with at least 1,000,000 EUR funding by EU): https://cordis.europa.eu/search?q=contenttype%3D%27project%2...

Of course, a lot of those projects will go nowhere (and are therefore seen as "waste" by some); that's bound to happen with any large-scale program like this. But unless you make bets on a lot of projects, you won't get results like this submission.


I know a lot of people who apply for them... Everyone involved knows that the vast majority of the projects will shut up shop mere days after the funding ends. People buy equipment simply because they know they'll get to keep it after the project has folded. The whole team then pivots to the next project. There is zero chance of success for any of them - it's always more profitable to close down and use a different idea for the next round of funding.

Maybe some of the projects will go somewhere, but there are a good chunk where even the project leads set off from the outset with the goal of getting as much EU money as possible and closing down as quickly as possible when the money stops.


> I know a lot of people

> The vast majority of the projects everyone

How many people are we talking about? 100? 1000?

If there are on average 20 people involved with each project that received funding, that would be 340,000 people in total. If you know 1000 people who are doing this scam, then you know about 0.3% of the people involved in these projects.

To extrapolate that all the projects are similar scams to what your acquaintances are doing seems a bit irrational.

But that you know 1000 people, 100 people or even 10 people performing identical scams of government funds seems unlikely. And if you really do know that many scammers, that seems to say more about you than about the Horizon 2020 programme.
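For anyone who wants to check the back-of-the-envelope numbers, here is a minimal sketch. The inputs are just the rough figures from this comment (about 17,000 projects, an assumed average of 20 people per project, and a hypothetical 1,000 acquaintances), not official statistics.

```python
# Back-of-the-envelope check of the estimate in the comment above.
# All three inputs are rough or hypothetical figures, not official data.
projects = 17_000          # "~17000 projects" from the CORDIS search
people_per_project = 20    # assumed average team size
known_people = 1_000       # hypothetical number of acquaintances

total_people = projects * people_per_project
share = known_people / total_people

print(f"people involved: {total_people:,}")
print(f"share known: {share:.2%}")
```

With these inputs the total is 340,000 people and the share comes out a little under 0.3%.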


"I know a lot of people who apply for them..."

Maybe you should complain to your corrupt friends instead who are stealing from the government.


Abuse exists in pretty much any aspect of procurement, from building roads with inflated contractor prices to producing "analysis" from consultants who happen to be friends with the relevant minister. The private sector is not immune either, with plenty of shareholders' money being funneled to the sales guy with the best golf routine.

Part of the solution is better law enforcement, which would include whistleblowing on your corrupt friends. Part is better oversight - and here you should bring it up with your national government, since EU authorities typically have little power to impose that once money has been disbursed to local entities.


Yeah I have that impression also of EU grants. There seems to be no touch point "on the ground" to see if people actually need what's being subsidized.

Where I lived in Ireland there was a "park" with some huge signs about how the EU was so generous to fund it.. But they failed to make an opening in the perimeter fence for people to actually enter. The paths just ended at a fence.

It was also totally useless as a park because it was a thin strip right next to a busy 4-lane road. You could hop the fence but it's useless trying to relax right beside a main road. Total cash grab from some real estate developer clearly.


This is really exciting. I absolutely hate the rare occurrences where I have to use Chrome / Google Translate just to buy something from an Amazon store that does not have an English version, like most EU Amazons or country-specific stores.

Having the ability to harness all this power locally would be awesome as both a developer and user. Big thanks to the team, whoever you are. I like the choice of languages. Not quite the usual suspects hehe


Most of Amazon’s regional websites offer machine translation (although the language selection varies by region). Click the flag next to the search field on the front page.


"Most" is not the case in Europe; AFAIK the only store in a non-English-speaking country that has an English version is Amazon DE. I usually end up searching for products there, then copying the ASIN and searching on Amazon in my neighbouring country (as they aren't available here [0]).

I really don't understand why e-commerce services do this, especially when the UI behind the scenes supports i18n and has translations. Amazon do have country specific listings, so maybe they just don't want to show two languages, but that is often what happens on Amazon DE if you choose English. If you choose to ship to a different country they will change prices to account for different taxes, so it's not that.

Other e-commerce services are even worse, e.g. Zalando has a single app for all their countries, and the listings are the same, but they don't let you choose the language at all. Just because I am living in a country, doesn't mean I speak the language of that country.

[0] Which is another weird thing, as most of the time they do ship here. I'm somewhat surprised they haven't merged to a single "Amazon Europe" storefront; instead they are still opening new country-specific storefronts.


I hate the English version of amazon.de. For some reason Amazon keeps activating it for me, and now whenever I type in an English book title it "helpfully" translates it to German.


I have this issue with Amazon Japan; it's frequently related to clicking links from friends or Google that have the language setting encoded in the URL, which changes my personal settings.


Do you use an English version of the browser or the OS? Maybe even on a phone because they could merge together all the inferred i18n preferences.


My browser is in English, with an accept header preference for English. But that does not mean that I want shitty machine-translated versions. After all, German is also in my accept header, so if that’s the only real version, please give me that one.
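To make the point concrete, here is a rough sketch of how a site could honour the Accept-Language header while only considering languages it has human-written pages for. The function names and the `human_written` list are made up for illustration. This is not how Amazon or any particular site does it, and it ignores details such as the `*` wildcard.

```python
from __future__ import annotations


def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Return (language_tag, q) pairs sorted by descending preference."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a missing q parameter means full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((tag.strip().lower(), q))
    # sorted() is stable, so entries with equal weight keep header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def pick_language(header: str, human_written: list[str]) -> str | None:
    """Pick the most preferred language that has a human-written version."""
    available = {lang.lower() for lang in human_written}
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        primary = tag.split("-")[0]  # "en-US" -> "en"
        if tag in available:
            return tag
        if primary in available:
            return primary
    return None


header = "en-US,en;q=0.9,de;q=0.8"
print(pick_language(header, ["de"]))        # only a German original exists
print(pick_language(header, ["de", "en"]))  # both versions are human-written
```

With only a German original available, the sketch falls through to `de` instead of serving a machine-translated English page. With both available, it picks `en`.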


Amazon France annoyingly does not offer an English UI. I’m so looking forward to Firefox Translations adding support for French-to-English translations.


As someone who avoids Google stuff mainly for privacy concerns, I love this.

I wonder why Mozilla hasn't given their own extension the "Recommended" badge on their add-on site. https://addons.mozilla.org/en-US/firefox/search/?q=Translati...


> I wonder why Mozilla hasn't given their own extension the "Recommended" badge on their add-on site.

Probably because it isn't fully baked. It will definitely be included by default or set as "Recommended" when it supports more languages.


It still looks a bit rough, and lacks several languages that I translate often in Firefox using a Google Translate bookmarklet.

Really looking forward to this addon getting more mature, as bookmarklets don't work when you are e.g. logged in or in cases with heavy JS.


I am also excited about this, although I am usually not translating full pages so I still mostly use Google Translate even for supported languages. It sounds like they are working on ways to translate sections of a page. I use the Right Click Search addon and select text to translate then "search" with Google Translate. There is a word limit but Google even helpfully provides a link to the next chunk if you exceed it.

https://addons.mozilla.org/en-US/firefox/addon/right-click-s...

The few times I've tried translating news articles in supported languages with the addon it seems to do really well, although it is understandably (and like Google Translate) more hit and miss for less formal stuff and song lyrics (which can be nearly untranslatable considering how little sense they often make if you do understand the language :/). This is translating to English though; it sounds like translating between other languages goes through English currently.


Thanks for the information. Very useful, I didn't know about this other addon.


If you're using a bookmarklet you can always give the deepl plugin a try: https://addons.mozilla.org/en-US/firefox/addon/deepl/


I just downloaded it to try it out (I live abroad, so I rely heavily on these extensions). Works fantastic and the UI is nice. I appreciate that you can choose to translate just one tab as you browse.

But on some websites it starts duplicating words over and over. So, seems like there are some rough edges to work out. But I'm definitely keeping it installed for when it is ready to go!


I use Bing translate because of that, but this is 100% worth looking into!


It's unfortunate that it's not available for Firefox mobile. For me at least, mobile is where I want translation, for example when traveling and viewing the sites of businesses.


Also doesn't work on Firefox Nightly on Mobile.

However, on the plus side, I'm not seeing a complete inability to translate in Nightly, since the context menu lets me translate using Google Translate. (I realise many won't want to send their text off to G, but this does at least show up one of the claimed limitations in the article, that it simply wasn't possible at all.)


Google Translate works offline if you download the language pair. If you block Google Translate from data then you have an offline translator.


Maybe it's too demanding to run on devices with very little cooling and power? Just a guess.


It requires x86/64 because it runs on WebAssembly.


That seems wrong? As far as I am aware WebAssembly is completely architecture-agnostic, and Firefox has implemented wasm VMs for both x86-64 and ARM.


And even if it wasn't architecture-agnostic, it's pretty easy to translate restricted and well-behaved machine code between architectures. Much easier than dealing with JavaScript itself.


This is a false statement


> It requires x86/64 because it runs on WebAssembly.

What is the technical requirement behind this?


There is none, one of the main objectives of WASM was to be a machine agnostic bytecode, similar to JVM bytecode for example. People have even built wasm VMs on FPGAs
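One small way to see this for yourself: a wasm binary starts with an 8-byte preamble (the magic bytes `\0asm` plus a format version), and nothing in it names a CPU, unlike native executable formats that carry a target-machine field. The sketch below only reads that preamble. `read_wasm_preamble` is a made-up helper name, and this is nowhere near a full parser.

```python
import struct

WASM_MAGIC = b"\x00asm"


def read_wasm_preamble(data: bytes) -> int:
    """Validate the 8-byte preamble and return the binary format version."""
    if len(data) < 8 or data[:4] != WASM_MAGIC:
        raise ValueError("not a WebAssembly binary")
    (version,) = struct.unpack("<I", data[4:8])  # little-endian u32
    return version


# The smallest valid module: just the preamble, with no sections after it.
empty_module = b"\x00asm\x01\x00\x00\x00"
print(read_wasm_preamble(empty_module))
```

The same bytes can be handed to a wasm runtime on x86-64 or ARM. Turning them into native code is the engine's job, not the module's.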


Congratulations to all Firefox folks around here for building a much-desired offline translation plugin that does not spy on people.


On a similar note, how hard is it to bring a grammar checker offline? Today most folks rely on Grammarly or similar services, which are basically keyloggers.

Is there an open source initiative aimed at bringing grammar checking to the edge?


Google’s Grammar checker works completely offline.[1]

[1]https://ai.googleblog.com/2021/10/grammar-correction-as-you-...

(Disclosure: I work at Google)


Thanks for the pointer. This is an impressive job - reducing a grammar-correcting model to as little as 20MB. Theoretically this could even be shipped to browsers, and if we are able to wrap it in an extension that works everywhere, this could seriously compete with Grammarly.

I could understand why Google wouldn't open-source this tech, but the blog pretty much covers how to build one. I'm surprised there isn't any open source project that took this direction to bring a privacy-focused grammar checker.


Word 97 had grammar checking and wouldn't have used an online service for that.


And it's pretty bad. It works half decently on English, because the English grammar is rather inflexible, but for languages with more variation in word order the Office grammar checkers were laughably bad. They could only spot a few errors, and those were not even relevant to bad writers.

I'm not blaming anyone, the constraints made it nearly impossible, and at the time there wasn't enough properly tagged corpus material to work with, not even for English.


Similarly, macOS has had basic local grammar checking for text fields in native apps and Safari for around a decade now.


https://github.com/languagetool-org/languagetool is pretty good for grammar. It's quite trivial to work with the REST API, try one of the docker images.
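As a rough illustration of how little code the REST API needs, here is a sketch using only the Python standard library. It assumes a LanguageTool server is already running locally. The host and port in `LT_URL` are an assumption that depends on how you started your container, and the helper names are made up.

```python
from __future__ import annotations

import json
import urllib.parse
import urllib.request

# Assumption: a local LanguageTool server, e.g. started from a Docker image.
# Adjust host and port to match your own setup.
LT_URL = "http://localhost:8010/v2/check"


def build_request(text: str, language: str = "en-US",
                  url: str = LT_URL) -> urllib.request.Request:
    """Build the form-encoded POST request for the /v2/check endpoint."""
    body = urllib.parse.urlencode({"text": text, "language": language})
    return urllib.request.Request(url, data=body.encode("utf-8"), method="POST")


def summarize(response_json: dict) -> list[tuple[int, int, str, list[str]]]:
    """Reduce a check response to (offset, length, message, suggestions)."""
    results = []
    for match in response_json.get("matches", []):
        suggestions = [r["value"] for r in match.get("replacements", [])]
        results.append(
            (match["offset"], match["length"], match["message"], suggestions)
        )
    return results


def check(text: str, language: str = "en-US"):
    """Send text to the server and return the summarized matches."""
    with urllib.request.urlopen(build_request(text, language), timeout=10) as resp:
        return summarize(json.load(resp))


# Usage (needs a running server, so it is left commented out):
# for offset, length, message, suggestions in check("This are a test."):
#     print(offset, length, message, suggestions[:3])
```

Everything stays on your own machine as long as the server runs locally, which is the whole point in a privacy thread like this one.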


In French, Grammalecte [1] is open source, works quite well and works offline.

[1] https://grammalecte.net/


“LanguageTool is an Open Source proofreading software for English, French, German, Polish, Russian, and more than 20 other languages. It finds many errors that a simple spell checker cannot detect.”

https://github.com/languagetool-org/languagetool https://languagetool.org/


After Grammarly stopped the service for users from Russia, I switched to LanguageTool, but found the experience much more lacking. It often suggests weird "fixes", for example replacing "look" with "onion", apparently because that's how "onion" sounds in Russian.


Hemingway Editor is one such app that may provide what you seek. [1] It's not Open Source, but it isn't expensive for what it does. It's a fixed price for purchase, not an ongoing subscription.

[1] https://hemingwayapp.com/


> It's a fixed price for purchase, not an ongoing subscription.

...and that is nowadays what I base my software choices (including mobile/Android) on.


The alternative to what are basically keyloggers does not necessarily have to be open source or offline. See other comments on Hemingway. Microsoft also has modern grammar checks built into Word 365 / Outlook 365 now. They work as well as Grammarly without increasing the trust surface. If you are already trusting Microsoft with the documents, you might as well allow them to grammar check.



Once, when I saw someone using Grammarly during a screen presentation, I made an instant mental note to self-censor when chatting with that person in the future, because this person does not care about digital privacy like I do and will broadcast at least parts of our conversations.


Does this do better than Meta's open source NLLB project?


That's what I was wondering as well. If I had to guess, the models that focus on tons of languages are often worse at, say, English->Spanish than a dedicated model, or a model that focuses on only a few high-resource languages.

Glad to see more offline-translation options though. Would be nice to have a benchmark for them soon to compare more easily.


Lack of support for Chinese or French is disappointing.


Looks like a French model was added 17 days ago: https://github.com/mozilla/firefox-translations-models/tree/...


French, Polish, and Ukrainian are included in the beta versions of the extension.


Where can I find the beta version of the extension?


It's a very odd set of languages that they have. French should have orders of magnitude more resources available for developing a translation engine than e.g. Bulgarian would, so why do the more obscure languages first?

Maybe they just happened to be able to get volunteers/partners who know those languages.


One of the main developers at Edinburgh is Bulgarian. :)


Possibly it has to do with who provided funding, or with a selection of certain language families.


Can't be language families—they've got Spanish, Italian, and Portuguese, so French would be an obvious choice to include.

The inclusion of Estonian is particularly odd. It's a very small language (1.1m native) and is the only non-Indo-European language in production or development.


One of the partner universities in the original project was the University of Tartu in Estonia


Reminds me of the language learning app Lingvist, which for a while had only German, French, and Estonian for a similar reason: the company is Estonian.


Because French is a polysynthetic language. Agglutinative and polysynthetic languages have difficulty with contemporary language models.


Er, I don't think that's it. French is pretty far from being polysynthetic and is usually classified as analytic. Why do you say it's polysynthetic? Because of the few personal pronoun clitics it has?

I'd imagine the 14 grammatical cases of Estonian would be harder to handle than French grammar, and yet they have Estonian.


Maybe I'm a horrible person for wanting this, but how hard would it be to repurpose this as a device-local translation extension for Chrome/Chromium/Brave?


Was kind of hoping this would be an impressions post about its accuracy after some use, but still glad to see more awareness raised.


Good idea, but why is this first-party add-on not compatible with the Android Firefox app? That's a bit disingenuous.


Is it possible to use this without the browser, i.e. as a simple plain old command line tool?


Asian languages are ignored. The world is only America and Europe.


Maybe it would look different if Asian nations contributed to the funding:

https://cordis.europa.eu/project/id/825303


Nor does it include African or native South American languages, right?

Even if this had not been sponsored by the EU, it wouldn't be surprising. This kind of product relies on decades of development. The smaller and poorer a language, the less research will have been done. Chinese and Japanese often have material available, but NLP research in other countries is still behind.



It's funded by the EU and it doesn't even cover all EU languages. This is just a small scale pilot project, it's unreasonable to expect them to cover 3000 languages.


Agreed. All of the languages I use semi-regularly are in SEA. All of the open source translation options tend to neglect Asian languages.

Google Translate seems to require an account or something on my microG'd phone and refused to work after an update, which meant I actually switched to Yandex--purely because I have no other services with them so translations are somewhat siloed away from other applications and services.


[flagged]


You're not wrong. Firefox really is always in last place for their features and Mozilla is entirely bankrolled by Google. So sending donations for trying to fund the browser is pointless as it is not going to that.

Mozilla is not leading in standards; always following big tech. It is now on life support as users do not care about using Firefox and have already declared Chrome (and the other derivatives) the winners. It's not even the first loser, which is obviously Safari.

So I'm afraid that ship has sailed when it comes to the browser wars. It just went from one behemoth (Microsoft's Internet Explorer) to another (Google's Chrome), and Mozilla did nothing as everyone celebrated Google Chrome's dominance.


The key premise is that this is an important feature. I don't think it is that important to many users. Browsing poorly translated websites just isn't a thing. Especially not with Google Translate, which in my experience almost but never quite makes sense with many languages.

About the only time it matters to me is when dealing with stuff I have to do (e.g. bureaucratic or legal things). In which case I prefer using deepl.com, which seems to do a better job of e.g. handling German legal texts.

I guess in a pinch it's handy to be able to translate a bit of text or a web page but that hardly is a regular thing. Browser extensions exist for that. Also for Firefox. I don't tend to use those though because I just don't really need or want that.


The real feature here is "on device", not the translation itself. A lot of people don't want to send all their foreign language browsing to third parties for translation.


It is absolutely essential when traveling to countries whose primary languages you don't understand. Mobile Firefox doesn't support any translation extensions, which means I use Chrome when traveling.


I am a hobbyist and frequently browse foreign language sites to engage with my hobbies beyond the Anglosphere. It’s remarkable how much more is out there when you start looking.



