Progress
The datasets supported by the SDK are listed below, together with their task schema information.
| Datasets | Updated Date | Task Schema | Normalized State | Comments | Constructor |
|---|---|---|---|---|---|
| govreport | 2022-02-01 | Summarization | Done | Current definition: text, summary | yixinliu |
| duorc | 2022-02-03 | QuestionAnsweringExtractive | Pending | different ids (plot_id, q_id) should be unified | jinlanfu |
| wiki_hop | 2022-02-03 | QuestionAnsweringExtractive | Pending | Two new fields: candidates, annotations | jinlanfu |
| hotpot_qa | 2022-02-03 | QuestionAnsweringHotpot | Pending | Many new fields (supporting_facts); context will be a JSON with a list of sentences. | jinlanfu |
| ropes | 2022-02-03 | QuestionAnsweringExtractive | Pending | Many new fields (situation). | jinlanfu |
| squad_adversarial | 2022-02-03 | QuestionAnsweringExtractive | Done | Current definition: question, context, answers. | jinlanfu |
| quoref | 2022-02-03 | QuestionAnsweringExtractive | Done | Current definition: question, context, answers. | jinlanfu |
| spider | 2022-02-03 | SemanticParsing | Pending | Current definition: question, query | jinlanfu |
| atis | 2022-02-05 | TextClassification | Done | Current definition: text, label | weizhe |
| cr | 2022-02-06 | TextClassification | Done | Current definition: text, label | weizhe |
| mr | 2022-02-06 | TextClassification | Done | Current definition: text, label | weizhe |
| qc | 2022-02-06 | TextClassification | Done | Current definition: text, label | weizhe |
| subj | 2022-02-06 | TextClassification | Done | Current definition: text, label | weizhe |
| afqmc | 2022-02-06 | TextMatching | Done | Current definition: text1, text2, label | zhengfu |
| sst2 | 2022-02-07 | TextClassification | Done | Current definition: text, label | weizhe |
| race | 2022-02-07 | QuestionAnsweringMultipleChoices | Done | Current definition: questions, context, options, answers. Note that (1) some datasets have a context and some do not; (2) there is an additional example id. | jinlanfu |
| drop | 2022-02-08 | QuestionAnsweringAbstractive | Pending | (1) Abstractive QA; (2) answers field has a new feature named (types). | jinlanfu |
| fb15k_237 | 2022-02-09 | KGLinkPrediction | Done | Current definition: head, link, tail | Pengfei |
| restaurant14 | 2022-02-09 | AspectBasedSentimentClassification | Done | Current definition: aspect, text, label | weizhe |
| sst5 | 2022-02-10 | TextClassification | Done | Current definition: text, label | weizhe |
| restaurant16 | 2022-02-10 | AspectBasedSentimentClassification | Done | Current definition: aspect, text, label | weizhe |
| openbookqa | 2022-02-11 | QuestionAnsweringMultipleChoicesWithoutContext | Pending | (1) current field: question, options, answers: text, option_idx; (2) The types of answers.text and answers.option_idx are String, not List. | jinlanfu |
| commonsense_qa | 2022-02-11 | QuestionAnsweringMultipleChoicesWithoutContext | Pending | (1) current field: question, options, answers: text, option_idx; (2) The types of answers.text and answers.option_idx are String, not List. (3) The test set does not provide annotated answers. | jinlanfu |
| winogrande | 2022-02-11 | QuestionAnsweringMultipleChoicesWithoutContext | Pending | (1) current field: question, options, answers: text, option_idx; (2) The types of answers.text and answers.option_idx are String, not List. (3) The test set does not provide annotated answers. | jinlanfu |
| laptop14 | 2022-02-11 | AspectBasedSentimentClassification | Done | Current definition: aspect, text, label | weizhe |
| | 2022-02-11 | AspectBasedSentimentClassification | Done | Current definition: aspect, text, label | weizhe |
| natural_questions | 2022-02-12 | QuestionAnsweringAbstractiveNQ | Pending | (1) current field: question, context, answers. Unlike extractive QA or abstractive QA, natural_questions has a complex structure (see the NQ schema definition). (2) The dataset is very large, occupying 135 GB of disk storage. | jinlanfu |
| ai2_arc | 2022-02-17 | QuestionAnsweringMultipleChoicesWithoutContext | Done | (1) current field: question, options, answers: text, option_idx; | jinlanfu |
| social_i_qa | 2022-02-17 | QuestionAnsweringMultipleChoices | Done | (1) Current definition: questions, context, options, answers. | jinlanfu |
| piqa | 2022-02-17 | QuestionAnsweringMultipleChoicesWithoutContext | Done | (1) current field: question, options, answers: text, option_idx; | jinlanfu |
| codah | 2022-02-17 | QuestionAnsweringMultipleChoicesWithoutContext | Pending | (1) current field: question, options, answers: text, option_idx; (2) There is a new but important field question_category. | jinlanfu |
| qasc | 2022-02-17 | QuestionAnsweringMultipleChoicesQASC | Pending | (1) Current definition: questions, context, options, answers. (2) The test set has no labeled answers. (3) context is a dictionary with the fields fact1, fact2 and combinedfact. (4) qasc has a new field named formatted_question. | jinlanfu |
| wikihow | 2022-02-17 | Summarization | Done | Current definition: text, summary | yixinliu |
| wikisum | 2022-02-17 | Summarization | Done | Current definition: text, summary | yixinliu |
| reddit_tifu | 2022-02-17 | Summarization | Done | Current definition: text, summary | yixinliu |
| bigpatent | 2022-02-17 | Summarization | Done | Current definition: text, summary | yixinliu |
| multi_xscience | 2022-02-17 | Summarization, MultiDocSummarization | Done | Current definition: (1) Summarization: text, summary, (2) MultiDocSummarization: texts, summary | yixinliu |
| multinews | 2022-02-17 | Summarization, MultiDocSummarization | Done | Current definition: (1) Summarization: text, summary, (2) MultiDocSummarization: texts, summary | yixinliu |
| dialogsum | 2022-02-17 | Summarization, DialogSummarization | Done | Current definition: (1) Summarization: text, summary, (2) DialogSummarization: dialogue: {"speaker": List[str], "text": List[str]}, summary: List[str] | yixinliu |
| samsum | 2022-02-17 | Summarization, DialogSummarization | Done | Current definition: (1) Summarization: text, summary, (2) DialogSummarization: dialogue: {"speaker": List[str], "text": List[str]}, summary: List[str] | yixinliu |
| qmsum | 2022-02-17 | Summarization, QuerySummarization | Done | Current definition: (1) Summarization: text, summary, (2) QuerySummarization: text, summary, query | yixinliu |
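
The field lists in the Comments column can be read as a minimal contract for each task schema. The sketch below illustrates that reading. It is not the SDK's actual API: `REQUIRED_FIELDS`, `missing_fields` and `normalize_answers` are hypothetical names, and wrapping the String-typed `answers.text` and `answers.option_idx` values in lists is only an assumption about how the pending rows might be normalized.

```python
from typing import Any, Dict, List

# Hypothetical mapping from task schema name to its required top-level fields,
# copied from the "Current definition" entries in the table above.
# This is an illustration, not the SDK's real schema definitions.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "Summarization": ["text", "summary"],
    "TextClassification": ["text", "label"],
    "TextMatching": ["text1", "text2", "label"],
    "AspectBasedSentimentClassification": ["aspect", "text", "label"],
    "KGLinkPrediction": ["head", "link", "tail"],
    "QuestionAnsweringExtractive": ["question", "context", "answers"],
}


def missing_fields(sample: Dict[str, Any], task_schema: str) -> List[str]:
    """Return the required fields of `task_schema` that `sample` lacks."""
    return [f for f in REQUIRED_FIELDS[task_schema] if f not in sample]


def normalize_answers(answers: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Wrap scalar answer values in lists; values that are already lists are kept.

    Assumption: this mirrors the pending String-vs-List issue noted for
    openbookqa, commonsense_qa and winogrande.
    """
    return {k: v if isinstance(v, list) else [v] for k, v in answers.items()}


if __name__ == "__main__":
    sample = {"text": "A long report ...", "summary": "A short summary."}
    print(missing_fields(sample, "Summarization"))  # []
    print(normalize_answers({"text": "sun", "option_idx": "B"}))
    # {'text': ['sun'], 'option_idx': ['B']}
```

A check like this could be run on a few samples of a dataset before its Normalized State is changed from Pending to Done.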