
What does Stanford University’s AI Index Report measure? #howmeasuring

Stanford University has published the seventh edition of its AI Index Report. The report tracks technical progress in artificial intelligence, public perception of the technology, and the geopolitical dynamics shaping its development. This year’s edition, published by the Stanford Institute for Human-Centered Artificial Intelligence (HAI), contains an expanded chapter on responsible AI and new chapters on AI in science and medicine, alongside the usual coverage of research and development, technical performance, the economy, education, policy and governance, diversity, and public opinion. IEEE Spectrum is a magazine and website run by the IEEE (Institute of Electrical and Electronics Engineers), the world’s largest technical professional association dedicated to advancing technology for the benefit of humanity; it provides news and analysis on trends and developments in engineering, science, and technology. Here are five charts from its coverage that you may not have seen.

The boom in generative AI investment

While corporate investment was down overall last year, investment in generative AI went through the roof. Nestor Maslej, editor-in-chief of this year’s report, tells Spectrum that the boom is indicative of a broader trend in 2023, as the world grappled with the new capabilities and risks of generative AI systems like ChatGPT and the image-generating DALL-E 2. “The story in the last year has been about people responding [to generative AI],” says Maslej, “whether it’s in policy, whether it’s in public opinion, or whether it’s in industry with a lot more investment.” Another chart in the report shows that most of that private investment in generative AI is happening in the United States.

Google is dominating the foundation model race

Foundation models are big multipurpose models; OpenAI’s GPT-3 and GPT-4, for example, are the foundation models that enable ChatGPT users to write code or Shakespearean sonnets. Since training these models typically requires vast resources, industry now makes most of them, with academia only putting out a few. Companies release foundation models both to push the state of the art forward and to give developers a foundation on which to build products and services. Google released the most in 2023.
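
To make “multipurpose” concrete, here is a minimal sketch of how a developer might build two very different features on the same foundation model. It assumes the official OpenAI Python client and an API key in the environment; the model name and prompts are illustrative, not taken from the report.

```python
# Minimal sketch: two unrelated tasks served by one foundation model.
# Assumes the official OpenAI Python client (pip install openai) and an
# OPENAI_API_KEY environment variable; the model name is illustrative.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    """Send a single-turn prompt to the same underlying model."""
    response = client.chat.completions.create(
        model="gpt-4",  # one general-purpose foundation model
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Task 1: write code.
print(ask("Write a Python function that reverses a string."))

# Task 2: write a Shakespearean sonnet.
print(ask("Write a Shakespearean sonnet about machine learning."))
```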

Open-source models and proprietary models

One of the hot debates in AI right now is whether foundation models should be open or closed, with some arguing passionately that open models are dangerous and others maintaining that open models drive innovation. The AI Index doesn’t wade into that debate, but instead looks at trends such as how many open and closed models have been released (another chart, not included here, shows that of the 149 foundation models released in 2023, 98 were open, 23 gave partial access through an API, and 28 were closed).

The chart above reveals another aspect: Closed models outperform open ones on a host of commonly used benchmarks. Maslej says the debate about open versus closed “usually centers around risk concerns, but there’s less discussion about whether there are meaningful performance trade-offs.”
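
The practical difference between the two camps shows up in how developers access the weights. Below is a hedged sketch, assuming the Hugging Face transformers library and using Mistral 7B as the open example (the model names are illustrative): an open model can be downloaded and run locally, while a closed one is reachable only through its vendor’s API, which also changes the conditions under which the two can be benchmarked.

```python
# Sketch of the open-vs-closed access difference; model names illustrative.
# Open weights: download the checkpoint and run it locally (assumes
# `pip install transformers` plus enough memory for a 7B-parameter model).
from transformers import pipeline

generator = pipeline("text-generation", model="mistralai/Mistral-7B-v0.1")
result = generator("The AI Index Report shows", max_new_tokens=40)
print(result[0]["generated_text"])

# Closed weights: there is no checkpoint to download; you can only send
# requests to the vendor's hosted API (see the earlier OpenAI sketch).
```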

How much energy does generative AI consume?

The AI Index team also estimated the carbon footprint of certain large language models. The report notes that the variance between models is due to factors including model size, data center energy efficiency, and the carbon intensity of energy grids. Another chart in the report (not included here) shows a first guess at emissions related to inference—when a model is doing the work it was trained for—and calls for more disclosures on this topic. As the report notes: “While the per-query emissions of inference may be relatively low, the total impact can surpass that of training when models are queried thousands, if not millions, of times daily.”
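
To see why inference can end up dominating, a back-of-the-envelope comparison helps. The sketch below uses made-up illustrative numbers, not figures from the report; it simply computes how many queries it takes for per-query inference emissions to overtake a one-off training cost.

```python
# Back-of-the-envelope: when do cumulative inference emissions pass training?
# All numbers below are illustrative assumptions, not values from the report.
TRAINING_EMISSIONS_TCO2 = 500.0   # one-off training cost, in tonnes of CO2
PER_QUERY_EMISSIONS_GCO2 = 2.0    # per-query inference cost, in grams of CO2
QUERIES_PER_DAY = 10_000_000      # assumed daily traffic

# Convert tonnes to grams, then divide by the per-query cost.
break_even_queries = TRAINING_EMISSIONS_TCO2 * 1_000_000 / PER_QUERY_EMISSIONS_GCO2
days_to_break_even = break_even_queries / QUERIES_PER_DAY

print(f"Inference overtakes training after {break_even_queries:,.0f} queries")
print(f"At {QUERIES_PER_DAY:,} queries/day, that is {days_to_break_even:.0f} days")
```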

Costs go down and revenues go up

And here’s why AI isn’t just a corporate buzzword: A McKinsey survey cited in the report showed that the integration of AI has caused companies’ costs to go down and their revenues to go up. Overall, 42 percent of respondents said they’d seen reduced costs, and 59 percent claimed increased revenue.

Other charts in the report suggest that this impact on the bottom line reflects efficiency gains and better worker productivity. In 2023, a number of studies in different fields showed that AI enabled workers to complete tasks more quickly and produce better quality work. One study looked at coders using Copilot, while others looked at consultants, call center agents, and law students. “These studies also show that although every worker benefits, AI helps lower-skilled workers more than it does high-skilled workers,” says Maslej.

Where AI can’t beat humans

In recent years, AI systems have outperformed humans on a range of tasks, including reading comprehension and visual reasoning, and Maslej notes that the pace of AI performance improvement has also picked up. “A decade ago, with a benchmark like ImageNet, you could rely on that to challenge AI researchers for five or six years,” he says. “Now, a new benchmark is introduced for competition-level mathematics and the AI starts at 30 percent, and then in a year it gets to 90 percent.” While there are still complex cognitive tasks where humans outperform AI systems, let’s check in next year to see how that’s going.

The question of how to regulate AI

When an AI company is preparing to release a big model, it’s standard practice to test it against popular benchmarks in the field, thus giving the AI community a sense of how models stack up against each other in terms of technical performance. However, it has been less common to test models against responsible AI benchmarks that assess such things as toxic language output (RealToxicityPrompts and ToxiGen), harmful bias in responses (BOLD and BBQ), and a model’s degree of truthfulness (TruthfulQA). That’s starting to change, as there’s a growing sense that checking one’s model against these benchmarks is, well, the responsible thing to do. Still, another chart in the report shows that consistency is lacking: Developers are testing their models against different benchmarks, making comparisons harder.
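
As an illustration of what such a check looks like in practice, here is a minimal sketch of scoring model outputs for toxic language, in the spirit of benchmarks like RealToxicityPrompts. It assumes the Hugging Face `evaluate` library’s toxicity measurement, which wraps a hate-speech classifier; the sample continuations are placeholders standing in for a model’s real generations on benchmark prompts.

```python
# Sketch of a responsible-AI style check: score generated text for toxicity.
# Assumes the Hugging Face evaluate library (pip install evaluate); the
# underlying classifier is downloaded on first use. The strings below are
# placeholders for a model's actual outputs on benchmark prompts.
import evaluate

toxicity = evaluate.load("toxicity", module_type="measurement")

model_outputs = [
    "Thank you for the question, happy to help.",
    "That is a reasonable point, though the data suggests otherwise.",
]

scores = toxicity.compute(predictions=model_outputs)
for text, score in zip(model_outputs, scores["toxicity"]):
    print(f"{score:.4f}  {text}")
```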

Further reading

Artificial intelligence, AlphaFold 3, and the meaning of life. Demis Hassabis’s TED interview

How to install Microsoft’s Phi-3 and how it works. Our hands-on test

How do Gemini’s new extensions work?