This repo contains a diagnostic benchmark for evaluating the robustness of text-to-SQL models. It comprises 17 perturbation test sets, each measuring model robustness from a different angle. It ...