2021-01-05 | Xiaohang Zhao: A Deep Learning Approach to Industry Classification

Abstract

Industry classification systems (ICSs), which identify economically related firms as peer firms, play a central role in business research and practice. Traditional expert-driven approaches manually design ICSs and thus have limitations, including high maintenance costs and coarse granularity of the identified firm relatedness. To circumvent these limitations, recent research takes an algorithm-driven approach, employing a bag-of-words method to represent firms’ 10-K reports and leveraging these representations for identifying economically related firms. While firms’ 10-K reports are highly informative for identifying economically related firms, the bag-of-words method is inadequate for representing these documents, as it ignores the rich semantic information encoded in word contexts and order, resulting in a less effective ICS. Recent developments in deep-learning-based document embedding provide powerful tools for document representation. However, existing document embedding models (DEMs) are not well suited to capture the rich semantics of 10-K reports due to their challenging nature: they are long documents featuring heterogeneous and shifting concepts. We propose a novel DEM to address these challenges; it solves them through an innovative design of an adaptive gating mechanism and its associated gating function. In addition, we develop a new ICS that takes firms’ 10-K reports as input, employs the proposed DEM to represent the semantics of these reports, and identifies economically related firms based on similarities between their 10-K representations. We demonstrate through extensive empirical evaluations that our proposed ICS is superior to representative existing ICSs as well as ICSs constructed using state-of-the-art DEMs. This study contributes to business research and practice with a novel ICS that can effectively identify economically related firms. It also contributes to the field of deep-learning-based document embedding with an innovative DEM that can capture the semantics of a broad variety of long documents with shifting concepts, such as 10-K reports, legal documents, and patent documents.

 

Time

2021-01-05 (Tuesday) 09:00-11:00

 

Speaker

Xiaohang Zhao is a Ph.D. Candidate in Financial Service Analytics at the Alfred Lerner College of Business & Economics, University of Delaware. His primary research interest is designing novel methods for solving problems in Financial Technology, Social Network Analytics and Health Care Analytics by leveraging tools in Deep Learning, Machine Learning and Natural Language Processing. Xiaohang Zhao holds a bachelor's degree in Financial Engineering from Renmin University of China.

 

Venue

Zoom meeting ID: 93491104511

Password: 629349