AI arm of Sony Research to help develop large language model with AI Singapore

Sony Research has inked a partnership to help test and fine-tune the Southeast Asian Languages in One Network (SEA-LION) artificial intelligence (AI) model, focusing on Indian languages.

The AI arm of Sony Research will work with AI Singapore (AISG), which is responsible for the development of SEA-LION, to plug gaps in ensuring the large language model (LLM) holds up well on the global landscape, representing the region’s populations and languages. The partners said in a statement Tuesday that their research collaboration will involve LLMs under the SEA-LION umbrella, all of which are pre-trained and instruct-tuned specifically on Southeast Asian cultures and languages.

The open-source LLM has been trained on 981 billion language tokens, which AISG defines as fragments of words created from breaking down text during the tokenization process. These fragments include 623 billion English tokens, 128 billion Southeast Asian tokens, and 91 billion Chinese tokens.
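
To put the reported token counts in proportion, here is a minimal Python sketch that converts them into shares of the 981 billion total. The figures come from the article; the function name `token_shares` and the "unlisted" label are illustrative assumptions, and the article does not say what the remaining tokens consist of.

```python
# Token counts in billions, as reported in the article.
TOTAL_BILLIONS = 981
reported = {"English": 623, "Southeast Asian": 128, "Chinese": 91}


def token_shares(total, counts):
    """Return each category's percentage of the total and the unlisted remainder.

    The "unlisted" entry is simply total minus the reported categories;
    the article does not describe what those tokens are.
    """
    shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
    remainder = total - sum(counts.values())
    shares["unlisted"] = round(100 * remainder / total, 1)
    return shares, remainder


shares, remainder = token_shares(TOTAL_BILLIONS, reported)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

On these figures, English accounts for roughly 63.5% of the training tokens and Southeast Asian languages for about 13%, with 139 billion tokens not broken out in the article.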

Under the partnership, Sony will work on tests and feedback on the AI model, tapping the Japanese vendor’s research presence in India and expertise in the development of LLMs for Indian languages, including Tamil. Tamil is estimated to be used by 60 million to 85 million people globally, most of whom are based in India and Southeast Asia.

Sony will exchange best practices on LLM development and research methodologies, as well as the application of its research in speech generation, content analysis, and recognition.

The integration of the SEA-LION AI model with Tamil language capabilities has the potential to boost the performance of new applications, said AISG’s senior director of AI products Leslie Teo. He added that the Singapore agency will also share its knowledge and best practices in LLM development.

IBM and Google are among other industry players drawn into fine-tuning the regional LLM, including making it available for developers to build customized AI applications.

“Access to LLMs that address the global landscape of language and culture has been a barrier to driving research and developing new technologies that are representative and equitable for the global populations we serve,” said Hiroaki Kitano, president of Sony Research. “Diversity and localization are vital forces. In Southeast Asia specifically, there are more than 1,000 different languages spoken by the citizens of the region. This linguistic diversity underscores the importance of ensuring AI models and tools are designed to support the needs of all populations around the world.”

Established in April 2023, Sony Research focuses on technological development that can enhance content creation and fan engagement, including in the areas of AI, sensing, and virtual spaces. For instance, its deep learning research team has been working on technologies that include, among others, model compression and neural rendering, which it hopes can be integrated into Sony’s GUI development tool Neural Network Console and open-source libraries Neural Network Libraries.

These technologies can be used in AI-powered electronics products spanning various sectors, such as games, movies, and music, Sony said.

Its interactive entertainment unit has filed a patent for a “harassment detection apparatus” that includes an input unit built to receive biometric data and with capabilities to generate, based on the biometric data, emotion data associated with users, according to an April 2024 publication on the World Intellectual Property Organization’s PatentScope search platform.

With the system, Sony hopes to be able to detect and mitigate malicious communications, such as harassment, between individuals in multiplayer games or virtual reality experiences. Tapping machine learning and AI models, the system can detect biometric data such as speech and determine a player’s emotional state, for instance, through sounds such as sobbing and screaming. These may be used to identify victims of harassment within the shared environment, according to the filing.

In May, Sony Music Group released a statement noting that its artists’ copyrighted works, including compositions, lyrics, and audio recordings, should not be scraped and used to train AI models unless explicitly authorized.

