OpenEvals Simplifies LLM Evaluation Process for Developers

by Blockchain
February 26, 2025

Zach Anderson Feb 26, 2025 12:07

LangChain introduces OpenEvals and AgentEvals to streamline evaluation processes for large language models, offering pre-built tools and frameworks for developers.

LangChain has launched two new packages, OpenEvals and AgentEvals, aimed at simplifying the evaluation process for large language models (LLMs). The packages give developers a framework and a set of pre-built evaluators for assessing LLM-powered applications and agents, according to LangChain.

Understanding the Role of Evaluations

Evaluations, often referred to as evals, are crucial in determining the quality of LLM outputs. They involve two primary components: the data being evaluated and the metrics used for evaluation. The quality of the data significantly impacts the evaluation’s ability to reflect real-world usage. LangChain emphasizes the importance of curating a high-quality dataset tailored to specific use cases.

The metrics for evaluation are typically customized based on the application’s goals. To address common evaluation needs, LangChain developed OpenEvals and AgentEvals, sharing pre-built solutions that highlight prevalent evaluation trends and best practices.
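
As a rough illustration of the two components described above, data and metrics, the following sketch runs a metric over a small dataset and averages the scores. The dataset, the `exact_match` metric, the `run_eval` helper, and the `toy_app` stand-in are all hypothetical names chosen for this example; none of them is part of OpenEvals or AgentEvals.

```python
from typing import Callable

# Hypothetical dataset: each example pairs an input with the expected answer.
DATASET = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
    {"input": "3 * 3", "expected": "9"},
]


def exact_match(output: str, expected: str) -> float:
    """Metric: 1.0 if output equals expected, ignoring case and outer whitespace."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_eval(
    app: Callable[[str], str],
    dataset: list[dict],
    metric: Callable[[str, str], float],
) -> float:
    """Run the app over every example and return the mean metric score."""
    scores = [metric(app(ex["input"]), ex["expected"]) for ex in dataset]
    return sum(scores) / len(scores)


def toy_app(question: str) -> str:
    """Stand-in for an LLM-powered application (assumption: a real app calls a model)."""
    answers = {"2 + 2": "4", "capital of France": "paris", "3 * 3": "6"}
    return answers.get(question, "")


# Two of the three answers match, so the mean score is 2/3.
mean_score = run_eval(toy_app, DATASET, exact_match)
```

Swapping in a different dataset or metric changes what the score means, which is why both need to reflect the application's real usage and goals.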

Common Evaluation Types and Best Practices

OpenEvals and AgentEvals focus on two main approaches to evaluations:

  1. Customizable Evaluators: LLM-as-a-judge evaluations are widely applicable, and developers can adapt the pre-built examples to their specific needs.
  2. Specific Use Case Evaluators: These are designed for particular applications, such as extracting structured content from documents or managing tool calls and agent trajectories. LangChain plans to expand these libraries to include more targeted evaluation techniques.

LLM-as-a-Judge Evaluations

LLM-as-a-judge evaluations are prevalent because they are well suited to assessing natural language outputs. These evaluations can be reference-free, meaning an output can be assessed without a ground truth answer to compare against. OpenEvals aids this process by providing customizable starter prompts, incorporating few-shot examples, and generating reasoning comments for transparency.
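
The pattern described above can be sketched in a few lines of plain Python. This is not the OpenEvals API: the prompt text, `make_judge`, `format_few_shot`, and the offline `fake_model` stub are assumptions made for illustration, and a real setup would replace the stub with an actual model call.

```python
import json
from typing import Callable

# Hypothetical starter prompt; a real project would adapt the wording.
PROMPT_TEMPLATE = """You are grading an answer for conciseness.
{few_shot}
Question: {inputs}
Answer: {outputs}
Reply with JSON: {{"score": true or false, "comment": "<your reasoning>"}}"""


def format_few_shot(examples: list[dict]) -> str:
    """Render already-graded examples so the judge can calibrate against them."""
    return "\n".join(
        f"Example answer: {ex['outputs']}\nExample score: {ex['score']}"
        for ex in examples
    )


def make_judge(model: Callable[[str], str], few_shot_examples: list[dict]):
    """Build a reference-free evaluator: it takes inputs and outputs, no ground truth."""

    def evaluate(inputs: str, outputs: str) -> dict:
        prompt = PROMPT_TEMPLATE.format(
            few_shot=format_few_shot(few_shot_examples),
            inputs=inputs,
            outputs=outputs,
        )
        parsed = json.loads(model(prompt))
        # Keep the judge's reasoning next to the score for transparency.
        return {"score": bool(parsed["score"]), "comment": str(parsed["comment"])}

    return evaluate


def fake_model(prompt: str) -> str:
    """Stub standing in for a real LLM call so the sketch runs offline."""
    answer = prompt.split("Answer: ")[-1].split("\n")[0]
    concise = len(answer.split()) <= 5
    comment = "Short and direct." if concise else "Contains unnecessary words."
    return json.dumps({"score": concise, "comment": comment})


judge = make_judge(fake_model, [{"outputs": "Paris.", "score": True}])
result = judge(inputs="What is the capital of France?", outputs="Paris.")
verbose = judge(
    inputs="What is the capital of France?",
    outputs="Well, the capital of France is, as far as I know, Paris.",
)
```

Returning the comment alongside the score makes it easier to audit why the judge graded an output the way it did.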

Structured Data Evaluations

For applications that require structured output, OpenEvals offers tools to ensure the model’s output adheres to a predefined format. This is crucial for tasks such as extracting structured information from documents or validating parameters for tool calls. OpenEvals supports exact match configuration or LLM-as-a-judge validation for structured outputs.
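
A minimal sketch of the exact-match idea for structured output, assuming the model's output and the reference are both plain dictionaries. The function name `exact_match_by_key`, the report layout, and the invoice fields are hypothetical and do not come from OpenEvals.

```python
def exact_match_by_key(outputs: dict, reference: dict) -> dict:
    """Compare each reference field with the extracted value.

    The score is the fraction of reference fields reproduced exactly.
    Fields the model added beyond the reference are reported separately.
    """
    per_key = {
        key: key in outputs and outputs[key] == expected
        for key, expected in reference.items()
    }
    unexpected = sorted(set(outputs) - set(reference))
    score = sum(per_key.values()) / len(per_key) if per_key else 1.0
    return {"score": score, "per_key": per_key, "unexpected_keys": unexpected}


# Hypothetical extraction task: pull invoice fields out of a document.
reference = {"invoice_id": "INV-001", "total": 120.5, "currency": "USD"}
extracted = {"invoice_id": "INV-001", "total": 120.0, "currency": "USD", "notes": "n/a"}

report = exact_match_by_key(extracted, reference)
```

Exact matching suits fields with a single correct value, such as identifiers or amounts; free-text fields are usually better served by an LLM-as-a-judge check.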

Agent Evaluations: Trajectory Evaluations

Agent evaluations focus on the sequence of actions an agent takes to accomplish a task. This involves assessing which tools the agent selects and the trajectory it follows, that is, the order of its steps. AgentEvals provides mechanisms to check that agents are using the correct tools and following the appropriate sequence.
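
One simple way to picture a trajectory check is to compare the tools an agent called against a reference sequence. The sketch below is an illustration under assumptions, not the AgentEvals API: the step format (a dict with a `tool` key), the `trajectory_match` function, and the two mode names are choices made for this example.

```python
from collections import Counter


def tool_names(trajectory: list[dict]) -> list[str]:
    """Extract the ordered list of tool names from a trajectory."""
    return [step["tool"] for step in trajectory]


def trajectory_match(outputs: list[dict], reference: list[dict], mode: str = "strict") -> bool:
    """Check an agent's tool calls against a reference trajectory.

    strict:    same tools in the same order.
    unordered: same tools the same number of times, in any order.
    """
    actual, expected = tool_names(outputs), tool_names(reference)
    if mode == "strict":
        return actual == expected
    if mode == "unordered":
        return Counter(actual) == Counter(expected)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical trajectories for a task that needs a search, then a calculation.
reference = [{"tool": "search"}, {"tool": "calculator"}]
same_order = [{"tool": "search"}, {"tool": "calculator"}]
swapped = [{"tool": "calculator"}, {"tool": "search"}]
extra_call = [{"tool": "search"}, {"tool": "calculator"}, {"tool": "search"}]

strict_ok = trajectory_match(same_order, reference)
strict_swapped = trajectory_match(swapped, reference)
unordered_swapped = trajectory_match(swapped, reference, mode="unordered")
unordered_extra = trajectory_match(extra_call, reference, mode="unordered")
```

The strict mode fits tasks where order matters; the unordered mode fits tasks where the agent may call the right tools in any sequence.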

Tracking and Future Developments

LangChain recommends using LangSmith for tracking evaluations over time. LangSmith offers tools for tracing, evaluation, and experimentation, supporting the development of production-grade LLM applications. Companies such as Elastic and Klarna use LangSmith to evaluate their GenAI applications.

LangChain’s initiative to codify best practices continues, with plans to introduce more specific evaluators for common use cases. Developers are encouraged to contribute their own evaluators or suggest improvements via GitHub.

Read the original article on Blockchain.News.
