New Anthropic Research Sheds Light on AI's 'Black Box'

Although they are created by humans, large language models are still quite mysterious. The high-octane algorithms that power our current artificial intelligence boom often work in ways that their human observers cannot readily explain. This is why AI has widely been dubbed a “black box,” a system that isn’t easily understood from the outside.

Newly published research from Anthropic, one of the top companies in the AI industry, attempts to shed some light on the more confounding aspects of AI’s algorithmic behavior. On Tuesday, Anthropic published a research paper designed to explain why its AI chatbot, Claude, chooses to generate content about certain subjects over others.

AI systems are set up as a rough approximation of the human brain: layered neural networks that take in and process information and then make “decisions” or predictions based on that information. Such systems are “trained” on large sets of data, which allows them to make algorithmic connections. When AI systems produce output based on their training, however, human observers don’t always know how the algorithm arrived at that output.

This mystery has given rise to the field of AI “interpretability,” where researchers attempt to trace the path of the machine’s decision-making so they can understand its output. In the field of AI interpretability, a “feature” refers to a pattern of activated “neurons” within a neural net, effectively a concept that the algorithm may refer back to. The more “features” within a neural net that researchers can understand, the more they can understand how certain inputs trigger the net to produce certain outputs.

In a memo on its findings, Anthropic researchers explain how they used a process known as “dictionary learning” to decipher what parts of Claude’s neural network mapped to specific concepts. Using this method, researchers say they were able to “begin to understand model behavior by seeing which features respond to a particular input, thus giving us insight into the model’s ‘reasoning’ for how it arrived at a given response.”
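To make the idea concrete, here is a minimal sketch of the "which features respond to an input" step. It is not Anthropic's method or code. It assumes a tiny, hand-made dictionary of feature directions (the names and numbers are invented for illustration) and uses matching pursuit, a simple sparse-coding routine, to express one toy activation vector as a combination of a few features. Real dictionary learning would also have to learn the feature directions from data, which this sketch omits.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


# Hypothetical dictionary: four unit-length feature directions in a
# three-dimensional toy activation space (more features than dimensions).
# Names and values are invented for illustration only.
DICTIONARY = {
    "bridge": [1.0, 0.0, 0.0],
    "san_francisco": [0.0, 1.0, 0.0],
    "cooking": [0.0, 0.0, 1.0],
    "landmark": normalize([1.0, 1.0, 0.0]),
}


def sparse_code(activation, dictionary, max_features=2, tol=1e-9):
    """Greedy matching pursuit: repeatedly pick the feature direction that
    best matches what is left of the activation, record its strength, and
    subtract its contribution from the residual."""
    residual = list(activation)
    coeffs = {}
    for _ in range(max_features):
        name, direction = max(
            dictionary.items(), key=lambda kv: abs(dot(residual, kv[1]))
        )
        strength = dot(residual, direction)
        if abs(strength) < tol:
            break  # nothing meaningful left to explain
        coeffs[name] = coeffs.get(name, 0.0) + strength
        residual = [r - strength * d for r, d in zip(residual, direction)]
    return coeffs, residual


# A made-up activation vector standing in for the model's internal state.
activation = [2.0, 0.5, 0.0]
features, leftover = sparse_code(activation, DICTIONARY)
print(features)
```

In this toy case the activation is explained by a strong "bridge" feature and a weaker "san_francisco" feature, which is the kind of readout the researchers describe: a short list of named features that respond to a given input.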

In an interview with Anthropic’s research team conducted by Wired’s Steven Levy, staffers explained what it was like to decipher how Claude’s “brain” works. Once they had figured out how to decode one feature, it led to others:

One feature that stuck out to them was associated with the Golden Gate Bridge. They mapped out the set of neurons that, when fired together, indicated that Claude was “thinking” about the massive structure that links San Francisco to Marin County. What’s more, when similar sets of neurons fired, they evoked subjects that were Golden Gate Bridge-adjacent: Alcatraz, California Governor Gavin Newsom, and the Hitchcock movie Vertigo, which was set in San Francisco. All told the team identified millions of features—a sort of Rosetta Stone to decode Claude’s neural net.
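One common way to find "adjacent" features like these is to compare feature directions by cosine similarity. The article does not say how the researchers measured closeness, so treat that as an assumption. The sketch below uses invented three-dimensional vectors and labels purely for illustration, and ranks the neighbors of a hypothetical "Golden Gate Bridge" feature.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical feature directions; the numbers are made up for this example.
FEATURES = {
    "Golden Gate Bridge": [0.9, 0.4, 0.1],
    "Alcatraz": [0.8, 0.5, 0.2],
    "Vertigo (film)": [0.7, 0.3, 0.6],
    "Unrelated topic": [0.0, 0.1, 0.9],
}


def nearest(name, features, k=2):
    """Return the k feature names most similar to the named feature."""
    target = features[name]
    others = [
        (other, cosine(target, vec))
        for other, vec in features.items()
        if other != name
    ]
    others.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in others[:k]]


neighbors = nearest("Golden Gate Bridge", FEATURES)
print(neighbors)
```

With these toy vectors, the two nearest neighbors are the "Alcatraz" and "Vertigo (film)" features, while the unrelated one ranks last.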
