Defining Custom Metrics for Precise AI Function Evaluation