• kibiz0r@midwest.social
    English
    arrow-up
    48
    ·
    5 months ago

    Interacting with people whose tone doesn’t match their words may induce anxiety as well.

    Have they actually proven this is a good idea, or is this a “so preoccupied with whether or not they could” scenario?

    • Admiral Patrick@dubvee.org
      English
      arrow-up
      43
      ·
      edit-2
      5 months ago

      Have they actually proven this is a good idea, or is this a “so preoccupied with whether or not they could” scenario?

      It’s businesses “throwing AI into stuff”, so I’m going to say it’s a safe bet it’s the latter.

  • Nath@aussie.zone
    arrow-up
    36
    ·
    5 months ago

    The biggest problem I see with this is the scenario where calls are recorded. They’re recorded in case we hit a “he said, she said” scenario. If some issue were to be escalated as far as a courtroom, the value of the recording to the business is greatly diminished.

    Even if the words the call agent gets are 100% verbatim, a lawyer can easily argue that a significant percentage of the message is in tone of voice. If that’s lost and the agent misses a nuance of the customer’s intent, they’ll have a solid case against the business.

    • Sneezycat@sopuli.xyz
      arrow-up
      5
      ·
      5 months ago

      I see no problem: they can record the original call and postprocess it with AI live for the operators. The recordings would be the original audio.
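
      A minimal sketch of that split, assuming the call audio arrives as chunks of samples. The names `handle_call` and `soften` are hypothetical, and `soften` is only a placeholder (it lowers the volume) standing in for whatever the real AI filter does:

```python
from typing import Callable, Iterable, List, Tuple

Frame = List[float]  # one chunk of audio samples


def soften(frame: Frame, gain: float = 0.5) -> Frame:
    """Hypothetical stand-in for the AI filter: it only lowers the volume."""
    return [sample * gain for sample in frame]


def handle_call(
    frames: Iterable[Frame],
    process: Callable[[Frame], Frame] = soften,
) -> Tuple[List[Frame], List[Frame]]:
    """Archive every frame untouched and build a processed copy for the operator."""
    archive: List[Frame] = []        # original audio, kept as the record
    operator_feed: List[Frame] = []  # filtered audio the operator hears
    for frame in frames:
        archive.append(list(frame))           # copy, so later edits cannot alter the record
        operator_feed.append(process(frame))  # the filter never touches the archive
    return archive, operator_feed
```

      The recording path and the operator path share only the input, so the stored audio stays the original no matter what the filter does.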

      • geissi@feddit.de
        English
        arrow-up
        11
        ·
        5 months ago

        Besides providing verbatim records of who said what, there is a second can of worms in forming any sort of binding agreement if the two sides of the agreement are having two different conversations.

        I think this is what the part about the missed nuance means.

  • Admiral Patrick@dubvee.org
    English
    arrow-up
    25
    ·
    edit-2
    5 months ago

    This is giving me Black Mirror vibes. Like when that lady’s consciousness got put into a teddy bear, and she only had two ways to express herself:

    • Monkey wants a hug
    • Monkey loves you

    I get that you shouldn’t go off on customer service reps (the reason you’re angry is never their fault), but filtering out the emotion/intonation in your voice is a bridge too far.

    • TachyonTele
      arrow-up
      14
      ·
      5 months ago

      Most of the time angry customers don’t even understand what they’re angry at. They’ll 180 in a heartbeat if the agent can identify the actual issue. I agree, this is unnecessary.

      • Banzai51@midwest.social
        English
        arrow-up
        6
        ·
        5 months ago

        Based on my experience working in a call center, I wouldn’t call it unnecessary. People are fucked up.

        • TachyonTele
          arrow-up
          6
          ·
          5 months ago

          It’s not an easy job, and it can absolutely be rough and frustrating. But knowing what your customer is saying is pretty important.

        • Nath@aussie.zone
          arrow-up
          4
          ·
          5 months ago

          I did phones in a different century, so I don’t know whether this would fly today. But my go-to for someone like this was “OK, I think I see the problem here. Shall we go ahead and fix it, or do you need to do more yelling first?”

          I can’t remember that line ever not shutting them down instantly. I never took it personally; whatever they had going on, they were never angry at me.

          Then again, I do remember firing a couple of customers (“we don’t want your business any more etc”) after I later became a manager and people were abusive to staff. So you could be right, also.

          • TachyonTele
            arrow-up
            3
            ·
            edit-2
            5 months ago

            Haha while I love the line, that last part would quickly get you pulled into a talk with management.

            I would laugh, and then tell you to never do that again.

      • Admiral Patrick@dubvee.org
        English
        arrow-up
        5
        ·
        edit-2
        5 months ago

        Yep, 100%.

        In college, I worked at a call center for one of the worst Banks of America (oops, meant banks in America 😉). Can confirm that, and I dealt with a LOT of angry customers.

  • Cybrpwca@beehaw.org
    English
    arrow-up
    19
    ·
    5 months ago

    I think I get what the article is saying, but all I can imagine is Siri calmly reading to me the vilest insults ever written.

  • jet@hackertalks.com
    English
    arrow-up
    11
    ·
    5 months ago

    If they’re going to do this, then customers can get support via text messaging, right? They’re not going to have to call in and talk to a computer just to have their voice turned into text for an agent, right?

    This isn’t about asymmetrically wasting the time of the customer so they don’t call support at all, right?

  • perishthethought
    English
    arrow-up
    10
    ·
    5 months ago

    Am I crazy or is 10,000 samples nowhere near enough for training people’s voices?

    • eveninghere@beehaw.org
      arrow-up
      3
      ·
      5 months ago

      If you have a pre-trained model or a classical voice matching algorithm as the basis, a few samples might suffice.

    • Kissaki@beehaw.org
      English
      arrow-up
      1
      ·
      5 months ago

      That doesn’t seem like too few samples for it to work.

      What they train for is rather specific: identifying the characteristics of anger and hostility, and adjusting pitch and inflection.

      Dunno if you meant it like that when you said “training people’s voices”, but they’re not replicating voices or interpreting meaning.

      learned to recognize and modify the vocal characteristics associated with anger and hostility. When a customer speaks to a call center operator, the model processes the incoming audio and adjusts the pitch and inflection of the customer’s voice to make it sound calmer and less threatening.

    • sunzu@kbin.run
      arrow-up
      1
      ·
      5 months ago

      Doubtful that’s enough to do anything useful. Maybe if the data is great and perfectly tuned with some guidance?

  • blindsight@beehaw.org
    arrow-up
    7
    ·
    edit-2
    5 months ago

    This seems like it might work really well. We’ve evolved to be social creatures, and internalizing the emotions of others is literally baked into our DNA (mirror neurons), so filtering out the emotional “noise” from customers seems, to me, like a brilliant way to improve the working conditions for call centre workers.

    It’s not like you can’t also tell the emotional tone of the caller based on the words they’re saying, and the call centre employees will know that voices are being changed.

    Also, I’m not so sure about reporting on anonymous Redditor comments as the basis for journalism. I know why it’s done, but I’d rather hear what a trained psychologist has to say about this, y’know?

  • bitwolf@lemmy.one
    arrow-up
    6
    ·
    5 months ago

    Dang, swearing was one of my strategies to get the bot to forward me to a representative

  • Xirup@yiffit.net
    arrow-up
    4
    ·
    edit-2
    5 months ago

    In my country, 99% of the time you contact technical support, a poorly made bot responds (it’s actually a while loop) with ambiguous, pre-written answers, and the only way to talk to a human is to go to the place in question in person, so there’s nothing to worry about on that front here.

  • AutoTL;DR@lemmings.world
    English
    arrow-up
    2
    ·
    5 months ago

    🤖 I’m a bot that provides automatic summaries for articles:

    According to a report from the Japanese news site The Asahi Shimbun, SoftBank’s project relies on an AI model to alter the tone and pitch of a customer’s voice in real-time during a phone call.

    SoftBank’s developers, led by employee Toshiyuki Nakatani, trained the system using a dataset of over 10,000 voice samples, which were performed by 10 Japanese actors expressing more than 100 phrases with various emotions, including yelling and accusatory tones.

    By analyzing the voice samples, SoftBank’s AI model has reportedly learned to recognize and modify the vocal characteristics associated with anger and hostility.

    In a Reddit thread on SoftBank’s AI plans, call center operators from other regions related many stories about the stress of dealing with customer harassment.

    Harassment of call center workers is a very real problem, but given the introduction of AI as a possible solution, some people wonder whether it’s a good idea to essentially filter emotional reality on demand through voice synthesis.

    By reducing the psychological burden on call center operators, SoftBank says it hopes to create a safer work environment that enables employees to provide even better services to customers.


    Saved 78% of original text.