Amazon Science’s Post

Amazon researchers developed a new architecture that reduces a foundation model's inference time by 30% while maintaining its accuracy. Like specialized regions in the brain, this new system selects appropriate subsets of neurons depending on the task: https://amzn.to/3IKPA6p

Nathan Susanj

Applied Science Manager at Amazon

1w

Really great work from the team! As AI applications grow, more cost will move from training to inference, and ideas like these, which dynamically allocate compute based on the complexity of the task, could play a large role in making AI applications sustainable in terms of cost.
