
What prevents the system from remembering your previous choices?

Then it can assume your choices haven't changed and propose a solution that matches them. And to give the user control, it just needs to tell the user explicitly which assumptions it made.

In fact, a smart enough system could even see when violating the assumptions could lead to a substantial gain and try convincing the user that it may be a good option this time.
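To make that concrete, here is a minimal sketch in Python of "remember, assume, and say what you assumed". Everything in it (`PreferenceMemory`, `Proposal`, `propose`, `describe`, the flight fields) is a hypothetical name made up for illustration, not any real assistant's API, and it only covers the plain carry-over case, not the "convince the user to deviate" part.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    choices: dict[str, str]      # final values used for this request
    assumptions: dict[str, str]  # values carried over from earlier requests


@dataclass
class PreferenceMemory:
    # Hypothetical store of the user's last known choices.
    remembered: dict[str, str] = field(default_factory=dict)

    def propose(self, stated: dict[str, str]) -> Proposal:
        # Anything the user did not say this time is filled in from memory.
        assumptions = {k: v for k, v in self.remembered.items() if k not in stated}
        # Explicitly stated values always win over remembered ones.
        choices = {**assumptions, **stated}
        # The merged result becomes the default for the next request.
        self.remembered = dict(choices)
        return Proposal(choices, assumptions)


def describe(proposal: Proposal) -> str:
    # Surface the assumptions so the user can correct them.
    if not proposal.assumptions:
        return "No assumptions made."
    listed = ", ".join(f"{k}={v}" for k, v in sorted(proposal.assumptions.items()))
    return f"Assuming same as last time: {listed}. Tell me if any of that changed."


memory = PreferenceMemory()
memory.propose({"destination": "Berlin", "stops": "layover ok", "trip": "one way"})
second = memory.propose({"destination": "Paris", "stops": "direct"})
print(describe(second))
```

In the second request only the trip type is carried over, so that is the only thing the user gets asked to confirm; the destination and the stops were restated and simply override the old values.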



It still has to tell you. Visually in a form it's much faster. Similar reason why many people prefer a blog post over a video.

Talking is not very efficient, and it's serial in fixed time. With something visual you can look at whatever you want whenever you want, at your own (irregular) pace.

You will also be able to make changes much faster. You can go to the target form element right away, and you get immediate feedback from the GUI (or from a physical control that you moved - e.g. in cars). If it's talk, you need to wait to have it said back to you - same reason as why important communication in flight control or military is always read back. Even humans misunderstand. You can't just talk-and-forget unless you accept errors.

You would need some true intelligence for just some brief spoken requests to work well enough. A (human) butler worked fine for such cases, but even then only the best made it into such high-level service positions, because it required real intelligence to know what your lord needed and wanted, and lots of time with them to gain that experience.


> It still has to tell you. Visually in a form it's much faster.

Who said it cannot be visual? It's still a “conversational” UI if it's a chatbot that writes down its answer.

> Similar reason why many people prefer a blog post over a video.

Well, I certainly do, but I also know that people like us are few and far between. People in general prefer videos over blog posts by a very large margin.

> Talking is not very efficient, and it's serial in fixed time. With something visual you can look at whatever you want whenever you want, at your own (irregular) pace. You will also be able to make changes much faster. You can go to the target form element right away, and you get immediate feedback from the GUI.

Saying “I want to travel to Berlin next Monday” is much faster than fighting with the website's custom date picker, which blocks you until you select a return date, until you realize you need to go back and toggle the “one-way trip” button before clicking the calendar, otherwise it doesn't work…

There's a reason why nerds love their terminal: GUIs are just very slow and annoying. They are useful for whatever new thing you're doing, because they're much more discoverable than a CLI, but they're much less efficient.

> If it's talk, you need to wait to have it said back to you - same reason as why important communication in flight control or military is always read back. Even humans misunderstand. You can't just talk-and-forget unless you accept errors.

This is true, but it stays true with a GUI. That's why you have those pesky confirmation pop-ups: as annoying as they are when you know what you're doing, they are necessary to catch errors.

> You would need some true intelligence for just some brief spoken requests to work well enough.

I don't think so. IMO you just need something that emulates intelligence well enough for that particular purpose. And we've seen that LLMs are pretty decent at emulating intelligence, so I wouldn't bet against them on that.


> Who said it cannot be visual? It's still a “conversational” UI if it's a chatbot that writes down its answer.

You can't be serious??

Oh, it's the 1st of April, my apologies! I almost took it seriously. I should ignore this website on this day.


I don't understand your complaint.

What's the difference between a blog post and a chatbot answer in terms of how “visual” things are?


> Similar reason why many people prefer a blog post over a video.

I used to be a read-the-blog-post rather than watch-the-video person, but for some things I’ve come to appreciate the video version. The reason you want the video is that the blog post only contains what the author thought was important. But I’m not them. I don’t know everything they know and I don’t see everything they see. I can’t do everything they do, but with the video I get everything. When someone performs the task on video, every detail is there, not just the ones they think are important. That bit between step 1 and step 2 that’s obvious? It’s not obvious to everyone, or mine is broken in a slightly different way and I really need to see that bit between 1 and 2. Of course, videos get edited and cut so they don’t always have that benefit, but I’ve grown to appreciate them.


The previous choice might not be what I want today.

Maybe I'm tired of layovers and I'm willing to pay more for a direct flight this time. Maybe I want a different selection at a restaurant because I'm in the mood for tacos rather than a burrito.


Just tell it then.


And then we're back to point one: retelling the whole stack of choices every time because nobody on the other side of the conversation, person or AI; can tell whether all my previous options are still valid. Because even I, the caller, might not remember what "defaults" I set in the previous call. So yeah, this argument in favor of conversational interfaces sounds at this point more like ideology than logic.


> every time because nobody on the other side of the conversation, person or AI; can tell whether all my previous options are still valid.

But you can, so as long as the interlocutor tells you what assumptions it made, you can correct it if it doesn't match your current mood.

> So yeah, this argument in favor of conversational interfaces sounds at this point more like ideology than logic.

There's no ideology behind the fact that everyone rich enough to pay someone to deal with mundane stuff has someone doing it for them; it's just about convenience. Nobody likes to fight with web UIs for fun. The only reason they have become mainstream is that they're so much cheaper than having a real person do the work.

Same for Microsoft Word, by the way: many people used to have secretaries typing stuff for them, and having to type things themselves has been a massive regression in social status for the upper middle class. It only happened because it was cheaper (in appearance at least).


Okay, I think I finally get your point, and I even agree. The comparison with an executive assistant doesn't help much here, though, because the CEO interacts with only one person for all those delegatable activities, and the expectation is that this person already knows all the defaults. That's what makes it smooth. This doesn't scale when you must deal with a different AI for each interaction. Will we get to a (maybe scary) point where Siri/Alexa/whoever can actually be that personal assistant? Maybe, but we're still far from it. So at least for today, the conversational interface is an extra burden. And tomorrow, we'll see.



