👋 Welcome to RefineBench — a comprehensive evaluation library for testing refinement capabilities of language models across multiple settings and domains. To reproduce the full results reported in ...
Abstract: The development of effective algorithms for removing surgical smoke in laparoscopic surgery has been hindered by the absence of a paired dataset containing real smoky and smoke-free surgical ...
Abstract: This paper compares synthetic and real-world code datasets for machine learning applications in cybersecurity by examining the relationships between machine code and Low-Level Virtual ...