# What we have and where we're going
Thanks for your interest in our project! We're pretty excited to finally show it to the public. That said, Tometo is by no means done, nor is it in a state where we could put it online as some sort of beta, but we have a fairly clear plan for what needs to happen for us to get there.
I got the initial proof of concept working back in April. By proof of concept, I mean that I wanted to see whether it was possible to automatically generate text-to-speech content and align the text with it at the same time, so that the words light up as they are being spoken. I finally got this working through an extremely obscure combination of technologies that I would rather not repeat here. Here's a video of the first time it really worked:
<video controls src="https://files.catbox.moe/cewoy8.mp4"></video>
Since then, I've been working on making the core process of generating and aligning the speech more scalable and, most importantly, faster. We've switched forced aligners (the program that aligns speech with its transcript) twice and finally managed to push it under one second, which means that going from clicking the button to seeing your status takes roughly a second or two.
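To make the "words light up" idea a bit more concrete, here's a minimal sketch of how aligner output could drive the highlighting. It is not Tometo's actual code: the `AlignedWord` struct and the `active_word` function are made-up names, and the assumption is that the aligner gives us a start and end time in milliseconds for each word, sorted by start time.

```rust
/// One aligned word: its text plus when it starts and ends in the audio.
/// Hypothetical shape; every forced aligner has its own output format.
#[derive(Debug, Clone, PartialEq)]
struct AlignedWord {
    text: String,
    start_ms: u32,
    end_ms: u32,
}

/// Returns the index of the word being spoken at `position_ms`, if any.
/// Assumes `words` is sorted by start time and the intervals don't overlap.
fn active_word(words: &[AlignedWord], position_ms: u32) -> Option<usize> {
    // Binary search for the first word that starts after the playback position.
    let idx = words.partition_point(|w| w.start_ms <= position_ms);
    // The only candidate is the word right before that one.
    let candidate = idx.checked_sub(1)?;
    if position_ms < words[candidate].end_ms {
        Some(candidate)
    } else {
        // We're in a pause between two words (or past the last one).
        None
    }
}
```

A player would call something like `active_word` on every frame with the current audio position and highlight the word at the returned index, or nothing during pauses.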
I also presented the proof of concept at a local Rust meetup:
<img src="https://files.catbox.moe/98posk.jpg" height=500 />
## What's the plan?
We have a milestone for what we want in our __initial release__ (basically the first time we version the software), which you can track [here](https://marisa.cloud/tometo/issues/issues?milestone_title=first-version). Among these items is an avatar system, akin to the Miis in Miitomo. The rudimentary version lets the user upload two images, one for when the mouth is closed and one for when it's open, but we'll expand on this after the first release (and we'll probably write more updates about it).
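As a rough illustration of the two-image idea, here's one way the avatar could pick which image to show. This is only a sketch under assumptions, not a description of how we'll actually build it: `MouthFrame` and `mouth_frame` are hypothetical names, and it assumes we know the time ranges (in milliseconds) during which speech is happening, for example from the alignment step.

```rust
/// Which of the two uploaded avatar images to display.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MouthFrame {
    Closed,
    Open,
}

/// Picks the avatar image for the current playback position.
/// `spoken` holds hypothetical (start_ms, end_ms) ranges in which a word is spoken.
fn mouth_frame(spoken: &[(u32, u32)], position_ms: u32) -> MouthFrame {
    // The mouth is open whenever the position falls inside any spoken range.
    let speaking = spoken
        .iter()
        .any(|&(start, end)| start <= position_ms && position_ms < end);
    if speaking {
        MouthFrame::Open
    } else {
        MouthFrame::Closed
    }
}
```

Anything fancier, like different mouth shapes per sound, would need more than two images, which is part of why this is only the starting point.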
Ideally, the first release is when we'll also have a live version for you all to try, but no guarantees on that yet. We'll update you when we get closer to that point!
Here are some other things that we've identified as necessary for our first release:
- Creating a blacklist and preventing statuses that match it from being created
- Adding _basic_ moderation tools
- Allowing users to change pitch and speed per avatar
- Implementing more secure authentication (moving from JWT to session cookies)
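
For the blacklist item, the simplest version we can imagine looks something like the sketch below. It's an assumption about the approach, not a finished design: `normalize` and `is_blocked` are hypothetical names, the blacklist is assumed to contain lowercase words without punctuation, and matching happens per whole word.

```rust
use std::collections::HashSet;

/// Lowercases a word and strips everything that isn't a letter or a digit,
/// so that "BadWord!" and "badword" compare as equal.
fn normalize(word: &str) -> String {
    word.chars()
        .filter(|c| c.is_alphanumeric())
        .flat_map(|c| c.to_lowercase())
        .collect()
}

/// Returns true if any whole word of `status` is on the blacklist.
/// Assumes the blacklist entries are already normalized.
fn is_blocked(status: &str, blacklist: &HashSet<String>) -> bool {
    status
        .split_whitespace()
        .map(normalize)
        .any(|word| blacklist.contains(&word))
}
```

Whole-word matching is easy to get around (by inserting spaces, for instance), so treat this as a first line of defense that works alongside the moderation tools rather than a replacement for them.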
If you have any ideas for what should _definitely_ be in the first release, feel free to open a new issue on the tracker and suggest it to us, or throw it at us on our [Discord] server.
## What can I do to help?
If you're interested in contributing on a technical level, feel free to join the [Discord] server and we'll give you pointers on where to go from there. If you're proficient in JavaScript, Vue, or Rust (specifically with Actix) and want to do a bunch of unpaid labor, don't say we didn't warn you!
Other than that, if something specific comes up, we'll tweet about it on [Twitter] and talk about it on Discord, so those are the places to be.
## Can you give any concrete times?
No. This is a volunteer project, and we're not getting paid for this in the first place, so concrete times are out of the question.
With that, one last reminder: if you want infrequent updates about the state of the project, [Twitter] is the place to follow us. We'll write more updates, too.
[Discord]: https://discord.gg/3cwfrMR
[Twitter]: https://twitter.com/tometo_official