Through a recently completed multi-pronged red teaming effort, the agency said it will develop repeatable testing datasets that can be used to evaluate large language model tools and services in the future.

Posted inUncategorized
Through a recently completed multi-pronged red teaming effort, the agency said it will develop repeatable testing datasets that can be used to evaluate large language model tools and services in the future.