    idonthatevests

    @idonthatevests


    Best posts made by idonthatevests

    • SOTY - Speech-To-Text Recognition on Ubuntu Touch

      Here, in this thread we discuss

      SPEECH RECOGNITION ENGINE on UBUNTU TOUCH

      Yeah, it's real. I've made it run on UT. Locally. Without sending anything to someone's server.

      This solution is called SOTY and is free software. It is a port of the VOSK API, a wrapper for Kaldi, which is a pretty neat speech recognition framework.

      Well, back to the point. Post your test results here, be surprised how (in)accurate the results are, review the source code and propose changes, ask questions about adapting your application to this feature, and don't forget to ask again why it isn't working in the background and how to use it. I recommend reading this whole long post before posting yourself.

      What is it?

      It is a speech recognition server: it receives input (raw audio data) from a client, processes it, and sends back a transcription of the audio recorded on the client.

      The server itself is completely useless without a client; it doesn't even have access to the audio subsystem.
      It was made to be combined with other software that can make use of speech recognition.

      So right now this is more of a framework for developers who might be interested in it.

      Installation

      Downloading the application from OpenStore will not be enough.
      You also need to install models.

      To install English language model you need to run these commands in terminal:

      Models can be installed using the in-application installer, which is accessible through the gear icon in the top right corner of the app interface.
      It currently supports transcribing in English. You can test your accent with...

      List of applications that work with SOTY STT

      UT Translator (recent update)
      To enable SOTY integration run in terminal:

      sed -i 's/enableSTT=false/enableSTT=true/g' /home/phablet/.config/ut-dictionary-frontend.ut-dictionary-frontend/ut-dictionary-frontend.ut-dictionary-frontend.conf
      
      

      Then, once SOTY is installed and running properly:

      1. Open SOTY first and start the server.
      2. Open UT Translator (WITHOUT CLOSING THE SOTY SERVER).
      3. Choose the English language. A microphone icon will appear on the top panel. Click on it to start recording audio.
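
The config edit for UT Translator could be wrapped in a small helper that refuses to run when the file is missing and keeps a backup first. This is only a sketch: the function name `enable_stt` and the `.bak` suffix are made up for illustration, and it assumes a `sed` that supports `-i` (GNU or BusyBox).

```shell
#!/bin/sh
# Hypothetical helper (not part of SOTY or UT Translator):
# flips enableSTT=false to enableSTT=true in the given config file.
enable_stt() {
    conf="$1"
    if [ ! -f "$conf" ]; then
        echo "config not found: $conf" >&2
        return 1
    fi
    cp "$conf" "$conf.bak"    # keep a copy in case something goes wrong
    sed -i 's/enableSTT=false/enableSTT=true/g' "$conf"
    # Succeed only if the setting is now present and enabled.
    grep -q 'enableSTT=true' "$conf"
}
```

It would be called with the same path as in the sed command above, i.e. `enable_stt /home/phablet/.config/ut-dictionary-frontend.ut-dictionary-frontend/ut-dictionary-frontend.ut-dictionary-frontend.conf`. A non-zero exit status means the file was missing or the setting was not found in it.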

      I hope this list will grow bigger over time.

      (I would be more than happy to have it integrated into the Lomiri keyboard, which would probably eliminate the need to integrate it with any other app, but I don't know if that's ever going to happen.)

      Quality

      It is now possible to transcribe everything you say locally on the device; your smartphone running Ubuntu Touch can totally do that.
      Too good to be true? Of course there are limitations.
      If we use small models, which is currently the case, they won't cover all the words in a language. And our small models are not good at transcribing previously unknown words or individual letters.

      You could try much bigger models that are intended for use on servers, but they require more RAM and more time to process your data. It is significantly slower. You will not like it. Implementing VAD (voice activity detection) preprocessing might help a little. And might not.

      Another fly in the ointment is that the models currently in use are no help with spelling out words. At all. They would need to be re-trained for this specific task.

      Summary

      I hope it has some potential. Whether it evolves into an open-source voice assistant for your device of the future or remains a funny conceptual toy is up to you, dear Community.

      How can I help

      Here's what you can do for this project:

      • Design
      • Code improvements
      • Guides for other people
      • If you are an app developer: think of ways it could be useful in your application
      • Testing and reporting bugs
      • (The most important) Improving models

      Plans

      Add models installer.
      Make it configurable.
      System OSK integration.

      Improving models

      Under construction

      posted in App Development
      idonthatevests
    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      New version is out
      Introducing protocol v2: client apps no longer need to use the microphone, because audio recording is now performed on the server. This decision has some drawbacks, though.
      CPU load has been reduced in this release.
      This release is also backwards compatible with protocol v1, where the client sends recorded data to the server, in case someone wants to send audio data from other sources.

      Also, my apologies, I forgot to link:
      Client library for v2 protocol
      Client library for v1 protocol (requires Microphone permission)

      posted in App Development
      idonthatevests
    • RE: [Call for] Nominations for the UBports Community Awards

      This idea has some flaws: there are many good devs and apps, and with these restrictions some of them would probably go undeservedly unmentioned. Others, despite putting a lot of work into this project, might not be nominated simply because we don't usually see them where we see other devs.

      I would like to nominate the following apps:

      • Waydroid because we all know why
      • LogViewer since it makes debugging much less painful

      Developers who absolutely deserve a mention here:

      • Danfro
      • fredldotme
      posted in General
      idonthatevests
    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      New version is out
      Changes:

      • Fixed a few bugs
      • Added a models installer (works for many languages listed in the menu; other models will be uploaded later)
      • The application UI can now be translated
      • It now comes with an amd64 build
      posted in App Development
      idonthatevests
    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      The client library for the v2 protocol is now a complete QML plugin that can easily be added to your application and used in your QML layout. The repository contains all the steps for integrating the speech recognition client into your project. No permissions are needed; the only requirement is that the server application is running locally in the background.

      posted in App Development
      idonthatevests
    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      @undrwater
      Thanks for your interest. You can easily integrate TTS support into your application using espeak-ng. However, the espeak data takes up 20 MB of user storage space. If you want this functionality in the Soty server, that would require changing the communication protocol for both server and client. That would not be too hard either, but I personally think we should look for a more accurate solution for this task that could be seamlessly integrated into the system, such as speech-dispatcher.

      posted in App Development
      idonthatevests
    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      @undrwater said in SOTY - Speech-To-Text Recognition on Ubuntu Touch:

      @idonthatevests I assume you're using LLM for STT. I'm doing that on my desktop in a python venv.

      No, I use small ASR models. Running an LLM on an old mobile CPU for this task would likely make speech recognition expensive and slow, and I think the same would apply to using one for speech synthesis. So using LLMs for that on a mobile OS is probably possible, but only for tasks that are not time-critical. Still, in my opinion espeak-ng is a fine option for that and is highly configurable.

      I'm still figuring my way around how UT is organized (I use gentoo, and it's quite different).

      There are many things in UT that are not organized yet, but that's what is great about UT for me: you can do it yourself! Have fun with your research.

      posted in App Development
      idonthatevests
    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      @undrwater
      There's an automatic installer in the Soty app. You can reach it by clicking the settings button (gear icon) in the right corner of the top panel.
      For manual installation, models should be put in .local/share/kl.soty
      Now I see that it was my mistake. I'm sorry for providing instructions that don't work. Editing the original post right now.

      posted in App Development
      idonthatevests
    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      @undrwater
      And now I was able to reproduce it... It was all caused by the directory kl.soty/models not being created properly. That's another shame on me. Thanks for finding it.
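
Until a fixed release is installed, the missing directory could be created by hand before attempting a manual installation. The sketch below is an assumption pieced together from the posts: it supposes that models live in a `models` subdirectory of `.local/share/kl.soty` in the user's home, and the function name `ensure_models_dir` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: creates the models directory if it is missing
# and prints its path.
# Assumption: models are stored under <data dir>/models.
ensure_models_dir() {
    data_dir="${1:-$HOME/.local/share/kl.soty}"
    mkdir -p "$data_dir/models" || return 1
    echo "$data_dir/models"
}
```

Calling `ensure_models_dir` without arguments uses the default location. Running it again is harmless, because `mkdir -p` does nothing when the directory already exists.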

      posted in App Development
      idonthatevests
    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      Thanks for the feedback, I will keep improving this tool

      posted in App Development
      idonthatevests

    Latest posts made by idonthatevests

    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      The models installer has been fixed in the new release; the update will appear on the OpenStore soon.

      posted in App Development
      idonthatevests
    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      @undrwater
      I just re-read my own message and realized that it refuses to run from the command line because I provided wrong file paths in my instructions again. I'm sorry for wasting your time on instructions that don't work, again. Such an embarrassment...
      The correct way to run it from the command line would be:

      export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/click.ubuntu.com/kl.soty/current/lib/aarch64-linux-gnu"
      /opt/click.ubuntu.com/kl.soty/current/kl.soty
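
Since the release notes mention an amd64 build, the two commands above could be wrapped so that the library directory follows the machine type. This is a rough sketch under assumptions: only `aarch64-linux-gnu` appears in this thread, the `x86_64-linux-gnu` directory name is a guess, and the function names `soty_triplet` and `run_soty` are made up for illustration.

```shell
#!/bin/sh
# Map a machine name (as printed by `uname -m`) to a library directory name.
# Only aarch64-linux-gnu comes from the thread; x86_64-linux-gnu is an assumption.
soty_triplet() {
    case "$1" in
        aarch64) echo "aarch64-linux-gnu" ;;
        x86_64)  echo "x86_64-linux-gnu" ;;
        *)       echo "unsupported machine: $1" >&2; return 1 ;;
    esac
}

# Hypothetical launcher: extends LD_LIBRARY_PATH for this one command
# and starts the SOTY binary from the click package directory.
run_soty() {
    root="/opt/click.ubuntu.com/kl.soty/current"
    triplet=$(soty_triplet "$(uname -m)") || return 1
    # ${VAR:+...} avoids a leading ':' when LD_LIBRARY_PATH is empty.
    LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$root/lib/$triplet" \
        "$root/kl.soty"
}
```

Calling `run_soty` from a terminal on the device would then start the application the same way as the two commands above, without leaving a modified LD_LIBRARY_PATH in the shell.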
      
      posted in App Development
      idonthatevests
    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      @undrwater
      Well, I've never got anything like that before, which is funny, because I use the same phone model and the same OS. Perhaps you are using 24.04-2.x? If so, that is not recommended unless you need one of the few features that are not in 24.04-1.x, such as VoLTE and proper Docker support. That version is still under heavy development, and since I don't have it yet, I could not even test on it. But if you are willing to invest some time, I think we could find a way to get it running properly, although that might require a lot of testing on your side. I see a lot of people choose to run bleeding edge, so 24.04-2.x support might quickly become a requirement.

      I installed both through OpenStore

      Does the in-application models installer work in UT Translator, or are you getting the same results as in SOTY?

      About the VOSK errors, could you share the contents of the en directory?

      UPD:
      Seems like VoLTE has been rolled out for 24.04.1 on Nord N10 too.

      posted in App Development
      idonthatevests
    • RE: SOTY - Speech-To-Text Recognition on Ubuntu Touch

      @undrwater
      This looks like an unusual bug. The installer is specifically programmed to show all progress changes immediately, so it is probably really stuck somewhere. This message should only appear if a button is clicked multiple times, and it should be overridden almost immediately. Is no other message displayed? What architecture are you trying to run it on? Is it a mobile UT installation? Which version of UT? Can you check with the netstat tool whether it at least tries to connect to GitLab? Also, UT Translator uses the same code for the installation process; does it not work for you either?
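
One possible way to do the netstat check is to filter its output for outgoing HTTPS connections while the installer appears stuck. This is a sketch under assumptions: it supposes the installer reaches GitLab over HTTPS on port 443 and that netstat prints the usual net-tools columns (protocol first, remote address fifth, state sixth). The function name `https_conns` is made up for illustration.

```shell
#!/bin/sh
# Reads `netstat -tn` output on stdin and prints the remote address and state
# of every TCP connection whose remote port is 443 (HTTPS).
https_conns() {
    awk '$1 ~ /^tcp/ && $5 ~ /:443$/ { print $5, $6 }'
}
```

Running `netstat -tn | https_conns` right after pressing the install button should list at least one line if a connection attempt is made. An empty result would suggest the installer never gets as far as the network.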

      You can of course try running it from the command line, but that would still require a GUI.

      export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/click.ubuntu.com/kl.soty/current/lib/aarch64-linux-gnu"
      /opt/click.ubuntu.com/kl.soty/current/kl.soty
      

      If you want to see all printed messages, you can also use the LogViewer app from the OpenStore.

      Thanks for reporting this issue. I hope we can make the installer work properly.

      posted in App Development
      idonthatevests
    • RE: Calendar and Alarms issue

      You can try clearing the app cache using the UT Tweak Tool.

      posted in Support
      idonthatevests