
2017-12-11 | Workshop on the Intersection of Software Automation and Natural Language Processing

2017-12-11

▊▎Time

Monday, December 11, 2017

Morning 8:30 - 11:30, afternoon 13:30 - 16:30

▊▎Venue

Shanghai University of Finance and Economics, 100 Wudong Road

Conference Room 308

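▊▎Program

Morning 8:30 - 9:10

Gray Links in the Use of Requirements Traceability

Nan Niu (牛楠), Assistant Professor, University of Cincinnati, USA

Morning 9:10 - 9:50

Code Big Data Analysis for Intelligent Software Development

Xin Peng (彭鑫), Professor, Fudan University; Vice Dean of the School of Software

Morning 10:10 - 10:50

Automatic Analysis Methods for Software Requirements in Software Product Lines

Yinglin Wang (王英林), Professor, Shanghai University of Finance and Economics; Chair of the Department of Computer Science and Technology

Morning 10:50 - 11:30

A Regression-Model-Based Method for Identifying Security Requirements in Open Source Software

Wentao Wang (王文韬), Ph.D. student, University of Cincinnati, USA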

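Afternoon 13:30 - 14:10

Semantic Analysis: From Syntactic Dependence to Deep Learning

Hai Zhao (赵海), Associate Professor, Department of Computer Science and Engineering, Shanghai Jiao Tong University

Afternoon 14:10 - 14:50

Key Technologies and Applications of Knowledge-Graph-Based Machine Language Cognition

Yanghua Xiao (肖仰华), Associate Professor, School of Computer Science and Software, Fudan University

Afternoon 15:10 - 15:30

Fine-Grained Mining of Automobile Reviews

Ming Wang (王明), combined Master's-Ph.D. student, Shanghai University of Finance and Economics

Afternoon 15:30 - 16:00: Open discussion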

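Organizers: Department of Computer Science and Technology, Shanghai University of Finance and Economics; SUFE Intelligent Systems Research Group

Sponsors: Shanghai University of Finance and Economics; National Natural Science Foundation of China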

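▊▎Purpose

Software is the crystallization of human knowledge and intelligence. In the big data era, software plays an increasingly important role and is becoming more complex, and making software automated and intelligent has become a widely shared goal. Meanwhile, natural language is the main carrier of human knowledge, and analyzing it automatically has been pursued for many years. In recent years, with ever richer corpora, accumulated experience and models, and steadily growing computing power, NLP has made considerable progress and has become a hot topic in artificial intelligence research and applications. Because requirements, business logic, and use cases in software development are usually expressed in natural language, software automation and natural language processing have a natural point of convergence. This workshop has invited experts from the University of Cincinnati (USA), Fudan University, Shanghai Jiao Tong University, and Shanghai University of Finance and Economics to present their experience in automated software analysis and natural language processing. Two Ph.D. students, from the University of Cincinnati and Shanghai University of Finance and Economics, will also share their research experience.

This is a rare opportunity for exchange. You are welcome to attend!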

▊▎Introduction

Talk 1

Title: Gray Links in the Use of Requirements Traceability

Speaker: Nan Niu (牛楠), Assistant Professor, University of Cincinnati, USA

Time: Morning 8:30 - 9:10

Abstract: The value of traceability is in its use. How do different software engineering tasks affect the tracing of the same requirement? In this work, we answer the question via an empirical study where we explicitly assign the participants into 3 trace-usage groups of one requirement: finding its implementation for verification and validation purposes, changing it within the original software system, and reusing it toward another application. The results uncover what we call "gray links": around 20% of the total traces are voted to be true links with respect to only one task but not the others. We discuss how our findings might impact the intelligent recognition of developers' tasks and contexts when traceability is created, maintained, and used.

Speaker bio: Nan Niu is an assistant professor with the Department of Electrical Engineering and Computer Science, University of Cincinnati, USA. He received the B.Eng. degree from the Beijing Institute of Technology, the M.Sc. degree from the University of Alberta, and the Ph.D. degree from the University of Toronto, all in computer science. His current research interests include software requirements engineering, information seeking in software engineering, and human-centered computing. Dr. Niu is a recipient of the U.S. National Science Foundation Faculty Early Career Development (CAREER) Award and the best research paper award at the IEEE International Requirements Engineering Conference (RE 2016).

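Talk 2

Title: Code Big Data Analysis for Intelligent Software Development

Speaker: Xin Peng (彭鑫), Professor, Fudan University; Vice Dean of the School of Software

Time: Morning 9:10 - 9:50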

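Abstract: The code big data formed by online resources such as open source software communities, software development Q&A sites, and online development manuals lays the foundation for data-driven intelligent software development. To effectively support intelligent development needs such as code search, recommendation, and generation, we need code big data analysis techniques that deeply understand the meaning of code artifacts such as code snippets and APIs, as well as developers' problem descriptions and the software development context. However, the semantic and lexical gaps among problem descriptions, context, and code artifacts make this process challenging. This talk analyzes the main problems and challenges of code big data analysis for intelligent software development, proposes a technical framework for code big data analysis based on program analysis, statistical learning, and knowledge reasoning, and introduces current research progress.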

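Speaker bio: Xin Peng (彭鑫) is Vice Dean of the School of Software at Fudan University, where he is a professor and doctoral supervisor; he holds a Ph.D. He is a senior member of the China Computer Federation (CCF) and a member of its Technical Committee on Software Engineering, Chair of CCF YOCSEF Shanghai for 2016-2017, an executive committee member of the CCF Shanghai chapter, Vice Chair of the Young Professionals Committee of the Shanghai Computer Society, and an editorial board member of the Journal of Software and of Computer Engineering and Applications. He received his bachelor's and Ph.D. degrees in computer science and technology from Fudan University in 2001 and 2006 respectively, and joined the faculty after completing his Ph.D. He was selected for Fudan University's Zhuoxue Program in 2013 and received the Neusoft-NASAC Young Software Innovation Award in 2016. His main research interests include software maintenance and evolution, intelligent software development, self-adaptive software, mobile computing, and cloud computing. As principal investigator he has led 3 Natural Science Foundation projects, 2 sub-projects of 863 Program projects, and 1 sub-project of a Key R&D Program project. He has published more than 60 papers in international conferences such as ICSE, FSE, ASE, CSCW, RE, ICSME, and SANER, and in Chinese and international journals such as ACM Transactions on Internet Technology, IEEE Transactions on Services Computing, IEEE Software, the Chinese Journal of Computers, the Journal of Software, and Science China. He has served on the program committees of international conferences including ICSME, RE, and COMPSAC. His research received the Best Paper Award at the 27th International Conference on Software Maintenance (ICSM 2011), and the software tools he developed have repeatedly won first, second, and third prizes in the software research prototype competition organized by the China Computer Federation.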

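Talk 3

Title: Automatic Analysis Methods for Software Requirements in Software Product Lines

Speaker: Yinglin Wang (王英林), Professor, Shanghai University of Finance and Economics; Chair of the Department of Computer Science and Technology

Time: Morning 10:10 - 10:50

Abstract: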

A software product line (SPL) is a set of software-intensive systems that share a common, managed set of features satisfying the specific needs of a particular application domain and that are developed from a common set of core assets in a prescribed way. To build a software product line, one of the most important tasks to complete beforehand is analyzing the requirements of the different systems or users in the particular domain. Requirements analysis for SPL aims at finding the commonality and variability model of requirements in a domain.

Software requirement specifications are usually expressed in natural language, which is informal, imprecise, and ambiguous, so analyzing them automatically is a challenging task. Moreover, because the requirement specifications from different users (or systems) are usually heterogeneous, they must be mapped to one another before further analysis. Although methods for the automatic analysis of software requirements have been studied before, many of them have limitations that hinder their application in practice, and effective research in this area is still lacking. Hence, in practice, the analysis and mapping of heterogeneous requirement documents from multiple users is still done by humans. This consumes a lot of manpower, is inefficient, and is error-prone.

In response to the above problems, this talk will discuss the state of the art in automatic analysis of software requirements for software product lines. We will discuss the main problems, concerns, and existing approaches. We will then focus on some promising new approaches that use natural language processing techniques, semantic analysis, and machine learning methods. The framework, tools, and experiments will be discussed in detail.

Speaker bio: Yinglin Wang is a professor in the School of Information Management and Engineering at Shanghai University of Finance and Economics. He received his Master's and Ph.D. degrees from Nanjing University of Science and Technology in 1992 and 1998 respectively. He worked as a professor in the Department of Computer Science and Engineering at Shanghai Jiao Tong University for many years before 2014. He was a visiting professor at Stanford University in 2005. His current research interests include software requirement analysis, machine learning, and big data analysis. Since 1998 he has led and completed many nationally funded projects in China in related areas. He has published a large number of papers in well-known journals and conference proceedings. He is an associate editor of IJSEKE, and he has served as conference chair, program chair, and program committee member for many international conferences.

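Talk 4

Title: A Regression-Model-Based Method for Identifying Security Requirements in Open Source Software

Speaker: Wentao Wang (王文韬), Ph.D. student, University of Cincinnati, USA

Time: Morning 10:50 - 11:30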

Abstract: Researchers have proposed several security requirements identification methods for up-front requirements engineering (RE). However, in open source software (OSS) projects, developers use lightweight representations and refine requirements frequently by writing comments. They also tend to discuss security aspects in comments by providing code snippets, attachments, and external resource links. In our work, we propose a new model based on logistic regression to identify security requirements in OSS projects. We used five metrics to build security requirements identification models and tested the performance of these metrics by applying those models to three OSS projects. Our results show that four out of five metrics achieved high performance in intra-project testing.

Speaker bio: Wentao Wang (王文韬) received the B.Sc. degree from Shanghai Maritime University and the M.Eng. degree from Beijing Institute of Technology, and is currently a Ph.D. student in the Department of Electrical Engineering and Computer Science at the University of Cincinnati. His research interests include software requirements engineering, information seeking in software engineering, and information retrieval.

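Talk 5

Title: Semantic Analysis: From Syntactic Dependence to Deep Learning

Speaker: Hai Zhao (赵海), Associate Professor, Department of Computer Science and Engineering, Shanghai Jiao Tong University

Time: Afternoon 13:30 - 14:10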

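Abstract: Early semantic analysis, also known as semantic role labeling, depended entirely on features or structural information from fully parsed syntactic trees as input; without them, performance dropped to unacceptable levels. The deep learning methods introduced over the past three years have shown that end-to-end models that do not rely on syntactic tree information can deliver considerable analysis performance, demonstrating that syntax-independent semantic analysis is possible.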

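Speaker bio: Hai Zhao is an associate professor and doctoral supervisor in the Department of Computer Science and Engineering at Shanghai Jiao Tong University. His research areas include natural language processing and related deep learning. He has published nearly 100 papers, including 11 CCF-A papers in recent years, about 20 SCI-indexed papers, and 30 papers at top computational linguistics conferences. His Google Scholar citation count exceeds 1,500. In the ACL Anthology index (19,000 indexed authors), he ranks in the top 2-4% by citations and H-index. He is a professional member of ACM, a member of the Young Professionals Committee of the Chinese Information Processing Society of China, a member of the CCF Technical Committee on Chinese Information Processing, Vice Chair of the Artificial Intelligence Committee of the Shanghai Computer Society, and a member of the PACLIC Steering Committee for 2014-2017. He has served on the program committees of ACL, EMNLP, EACL, NAACL, COLING, and AAAI, among others. He was Publication Chair of ACL-2016, Area Chair for parsing on the ACL-2017 program committee, and Area Chair for morphology and word segmentation on the ACL-2018 program committee.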

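Talk 6

Title: Key Technologies and Applications of Knowledge-Graph-Based Machine Language Cognition

Speaker: Yanghua Xiao (肖仰华), Associate Professor, School of Computer Science and Software, Fudan University

Time: Afternoon 14:10 - 14:50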

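Abstract: Big data and artificial intelligence technologies have developed rapidly in recent years. Techniques for building and applying large-scale knowledge bases, represented by knowledge graphs, and human-like learning techniques, represented by deep learning, have become hot research topics. The rapid development of knowledge graphs, deep learning, and related technologies is making machine understanding of human language increasingly feasible, and its value is increasingly evident in practical applications such as intelligent search, machine brains, and smart commerce. This talk gives a systematic introduction to the research progress of the Fudan University knowledge graph research group in machine language cognition, and to the real-world deployment of the related technologies.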

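Speaker bio: Yanghua Xiao (肖仰华) is an associate professor in the Department of Computer Science at Fudan University, Director of the Knowledge Engineering Research Laboratory at Fudan University, and Deputy Director of the Shanghai Internet Big Data Engineering Center. He received his Ph.D. in software theory from Fudan University in 2009. He has received the Alibaba Research Award and dozens of national or local research awards, and serves as chief scientist or senior advisor to several of China's leading big data and AI companies. He has been principal investigator of more than 30 projects funded by national and local agencies or by major companies including Microsoft, IBM, Baidu, Alibaba, Tencent, Huawei, China Telecom, China Mobile, and XiaoI Robot. He is also the founder of an AI startup. His main research interests are big data management and mining, graph databases, and knowledge graphs. He has published more than 100 papers in top international journals and conferences including TKDE, SIGMOD, VLDB, ICDE, IJCAI, and AAAI. He is a reviewer for more than 10 national and local funding agencies and a PC member of more than 50 international conferences including IJCAI, AAAI, ICDE, and CIKM. He is an associate editor of Frontiers of Computer Science and a reviewer for more than 10 top journals. He is a member of ACM and IEEE and a senior member of AAAI and CCF. He and his team built and released CN-DBpedia, the largest knowledge graph in China, and CN-Probase, the largest Chinese concept graph. He also built the first Chinese knowledge service platform (kw.fudan.edu.cn), which has served 650 million API calls for industry.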

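Talk 7

Title: Fine-Grained Mining of Automobile Reviews

Speaker: Ming Wang (王明), combined Master's-Ph.D. student, Shanghai University of Finance and Economics

Time: Afternoon 15:10 - 15:30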

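Abstract: With the rapid growth of e-commerce, large volumes of review data can be collected in a short time. As a result, review mining, and fine-grained review mining in particular, has become increasingly popular and effective for extracting information relevant to product design, product improvement, and brand marketing. However, because reviews are unstructured and loosely standardized in their expression, obtaining valuable information is not easy, and the difficulty is even more pronounced with unsupervised methods. To address this, we work with automobile review data collected from Tieba. We define a conceptual framework covering the different aspects of information in automobile reviews, propose an integrated strategy for annotating the relevant reviews, and then apply machine learning methods, trained on the annotated examples, to label the unannotated corpus automatically. Preliminary experiments show good results. This talk describes the methods used, analyzes the remaining problems, and discusses possible directions for improvement.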

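Speaker bio: Ming Wang received his bachelor's degree from Shanghai University, where he repeatedly won the scholarship for academic and moral excellence and the National Encouragement Scholarship, and was named an "Outstanding Student of Shanghai University" and an "Outstanding Graduate of Shanghai University". He is currently a combined Master's-Ph.D. student at Shanghai University of Finance and Economics, and won a third prize in the graduate mathematical modeling contest. In July 2017 he published a paper at the international conference IEA/AIE 2017. His main research interests include text mining, knowledge graphs, and question answering systems.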

Open discussion: Afternoon 15:30 - 16:00
