Optimizing Training Data for Image Classifiers
Authors: Matthew Hagen, Ala Eddine Ayadi, Jiaqi Wang, Nikolaos Vasiloglou,
Estelle Afshar. 2019.
In KDD 2019 Workshop on Data Collection, Curation, and Labeling for Mining and
Learning (DCCL, KDD ’19).
In this paper, we propose a robust method for outlier removal to improve
image classification performance. Increasing the size of the training data does
not necessarily raise prediction accuracy, because some instances may be poor
representatives of their respective classes. Four separate experiments are
conducted to evaluate the effectiveness of outlier removal for several classifiers.
Embeddings are generated from a pre-trained neural network, a fine-tuned
network, and a Siamese network. Subsequently, outlier detection is
evaluated based on clustering quality and on classifier performance for a
fully-connected feed-forward network, a K-Nearest Neighbors classifier, and a
gradient boosting model.
Read the PDF: Optimizing Training Data for Image Classifiers
(https://www.researchgate.net/publication/334327503_Optimizing_Training_Data_for_Image_Classifiers)
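The abstract outlines a pipeline: embed the images, remove outliers, then train a classifier on the cleaned set. This listing does not say which outlier detector the authors used, so the following is only a minimal Python sketch under stated assumptions. Toy 2-D vectors stand in for network embeddings, distance to the class centroid is a swapped-in outlier score, and `remove_outliers`, `knn_predict`, and `keep_fraction` are hypothetical names introduced here for illustration.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def remove_outliers(embeddings, labels, keep_fraction=0.8):
    """Return sorted indices of the points kept after per-class filtering.

    Assumption: within each class, the points farthest from the class
    centroid are treated as outliers. This is an illustrative stand-in,
    not the detector evaluated in the paper.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    kept = []
    for indices in by_class.values():
        center = centroid([embeddings[i] for i in indices])
        # Rank class members from closest to farthest from the centroid.
        ranked = sorted(indices, key=lambda i: math.dist(embeddings[i], center))
        n_keep = max(1, int(keep_fraction * len(indices)))
        kept.extend(ranked[:n_keep])
    return sorted(kept)


def knn_predict(train_x, train_y, query, k=1):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = defaultdict(int)
    for i in order[:k]:
        votes[train_y[i]] += 1
    return max(votes, key=votes.get)


# Toy "embeddings": two tight clusters, each with one stray point that
# carries the other cluster's position (a poor class representative).
embeddings = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.2],  # "cat"
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1], [0.2, 0.1],  # "dog"
]
labels = ["cat"] * 5 + ["dog"] * 5

kept = remove_outliers(embeddings, labels, keep_fraction=0.8)
clean_x = [embeddings[i] for i in kept]
clean_y = [labels[i] for i in kept]

query = [0.3, 0.2]  # lies next to the "cat" cluster
print("raw training set:    ", knn_predict(embeddings, labels, query))
print("cleaned training set:", knn_predict(clean_x, clean_y, query))
```

On this toy data, the 1-nearest-neighbor prediction for a query beside the "cat" cluster changes from "dog" to "cat" once the two stray points are dropped. This mirrors the abstract's point that a larger training set is not automatically a better one.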