AI might actually make more work for people, not less – study

03.09.2024 19:38

Read/Write Web

The use of generative artificial intelligence (AI) has exploded in popularity since ChatGPT launched in November 2022, but the efficiency of the technology has now come into question as a government trial has found AI could potentially create more work for people.

With companies worldwide scrambling to incorporate AI to stay ahead of the curve, the technology has become a major topic in the corporate world. Amid worries about an AI takeover of jobs and the technology's increasingly commonplace use (ChatGPT has reached 200 million weekly users), the trial aimed to see how the tool actually performs in a workplace setting and on business-related tasks.

Amazon and the Australian Securities and Investments Commission (ASIC), Australia’s corporate regulator, ran the test earlier this year, and the outcomes have since been shared in a select committee meeting.

The test used Meta’s open-source generative AI model, Llama2-70B. The model was prompted to summarize submissions with a focus on mentions of ASIC, recommendations, and references to regulation.

The tool was also asked to include context and page references. Ten staff members from the regulator were then given the same task with similar prompts.

The responses from both the AI and the human employees were then blindly assessed by a group of reviewers, who scored them on coherency, length, ASIC references, regulation references, and identification of recommendations. At the time, the reviewers weren’t aware that AI was involved.

Results of Australian government trial that pitted AI against human employees

The human summaries beat their AI counterparts on every criterion, scoring 81% compared with the technology’s 47%.

The AI did particularly badly at finding references to ASIC within the submission documents. In the results section of the trial report, the team running the experiment said: “Finding references in larger documents is a notoriously hard task for LLMs due to context window limitations and embedding strategies.

“Page references are not traditionally stored in the embedding models as the contents of PDF documents are ingested as plain text. To achieve better accuracy with this issue, substantial progress was made by splitting documents into pages and treating pages as chunks with associated metadata.”
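To make the page-chunking idea in that quote concrete, here is a minimal Python sketch. It is not the trial's code: the `PageChunk` class, the `chunk_by_page` and `pages_mentioning` helpers, and the assumption that pages in the extracted text are separated by a form feed character are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PageChunk:
    """One page of a document, kept together with where it came from."""
    doc_id: str  # hypothetical identifier for the submission
    page: int    # 1-based page number in the original document
    text: str


def chunk_by_page(doc_id: str, raw_text: str, page_sep: str = "\f") -> list[PageChunk]:
    """Split plain text into one chunk per page, preserving page numbers.

    Assumption: pages are separated by a form feed character. Empty pages
    are skipped but still counted, so numbering matches the source document.
    """
    chunks = []
    for number, page_text in enumerate(raw_text.split(page_sep), start=1):
        if page_text.strip():
            chunks.append(PageChunk(doc_id, number, page_text.strip()))
    return chunks


def pages_mentioning(chunks: list[PageChunk], term: str) -> list[int]:
    """Return the page numbers of chunks whose text contains the term."""
    return [c.page for c in chunks if term.lower() in c.text.lower()]


# Made-up four-page sample; the third page is blank.
sample = "Intro text.\fASIC should review the rule.\f\fWe recommend that ASIC act."
chunks = chunk_by_page("submission-01", sample)
print(pages_mentioning(chunks, "ASIC"))  # [2, 4]
```

Because each chunk carries its own page number as metadata, a page reference can be read straight off whichever chunk matched, rather than being inferred from a single block of plain text.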

Some of the AI responses were also described as “waffly” and “wordy,” lacking formatting and failing to follow the prompt’s requests satisfactorily.

The reviewers had “to refer back to the source material to confirm AI summary details,” and “assessors generally agreed that the AI outputs could potentially create more work if used (in current state), due to the need to fact check outputs, or because the source material presented information better.”


The post AI might actually make more work for people, not less – study appeared first on ReadWrite.