Thumbnail for 120449

Dasha AI is calling so you don’t have to

While you’d be hard pressed to find any startup not brimming with confidence over the disruptive idea they’re chasing, it’s not often you come across a young company as calmly convinced it’s engineering the future as Dasha AI.

The team is building a platform for designing human-like voice interactions to automate business processes. Put simply, it’s using AI to make machine voices a whole lot less robotic.

“What we definitely know is this will definitely happen,” says CEO and co-founder Vladislav Chernyshov. “Sooner or later the conversational AI/voice AI will replace people everywhere where the technology will allow. And it’s better for us to be the first mover than the last in this field.”

“In 2018 in the US alone there were 30 million people doing some kind of repetitive tasks over the phone. We can automate these jobs now or we are going to be able to automate it in two years,” he goes on. “If you multiple it with Europe and the massive call centers in India, Pakistan and the Philippines you will probably have something like close to 120M people worldwide… and they are all subject for disruption, potentially.”

The New York based startup has been operating in relative stealth up to now. But it’s breaking cover to talk to TechCrunch — announcing a $2M seed round, led by RTP Ventures and RTP Global: An early stage investor that’s backed the likes of Datadog and RingCentral. RTP’s venture arm, also based in NY, writes on its website that it prefers engineer-founded companies — that “solve big problems with technology”. “We like technology, not gimmicks,” the fund warns with added emphasis.

Dasha’s core tech right now includes what Chernyshov describes as “a human-level, voice-first conversation modelling engine”; a hybrid text-to-speech engine which he says enables it to model speech disfluencies (aka, the ums and ahs, pitch changes etc that characterize human chatter); plus “a fast and accurate” real-time voice activity detection algorithm which detects speech in under 100 milliseconds, meaning the AI can turn-take and handle interruptions in the conversation flow. The platform can also detect a caller’s gender — a feature that can be useful for healthcare use-cases, for example.

Another component Chernyshov flags is “an end-to-end pipeline for semi-supervised learning” — so it can retrain the models in real time “and fix mistakes as they go” — until Dasha hits the claimed “human-level” conversational capability for each business process niche. (To be clear, the AI cannot adapt its speech to an interlocutor in real-time — as human speakers naturally shift their accents closer to bridge any dialect gap — but Chernyshov suggests it’s on the roadmap.)

“For instance, we can start with 70% correct conversations and then gradually improve the model up to say 95% of correct conversations,” he says of the learning element, though he admits there are a lot of variables that can impact error rates — not least the call environment itself. Even cutting edge AI is going to struggle with a bad line.

The platform also …read more

0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply