===Submodular functions as generic tools for summarization===
The idea of a [[submodular set function]] has emerged as a powerful modeling tool for various summarization problems. Submodular functions naturally model notions of ''coverage'', ''information'', ''representation'' and ''diversity''. Moreover, several important [[combinatorial optimization]] problems occur as special instances of submodular optimization. For example, the [[set cover problem]] is a special case, since the set cover function is submodular. The set cover problem seeks a subset of objects that ''covers'' a given set of concepts. In document summarization, for instance, one would like the summary to cover all important and relevant concepts in the document, which is an instance of set cover. Similarly, the [[Optimal facility location|facility location problem]] is a special case of submodular optimization; the facility location function also naturally models coverage and diversity. Another example is the use of a [[determinantal point process]] to model diversity, and the Maximum-Marginal-Relevance procedure can likewise be seen as an instance of submodular optimization. All of these models, which encourage coverage, diversity and information, are submodular. Moreover, submodular functions can be combined efficiently, and the resulting function is still submodular (for example, a nonnegative weighted sum of submodular functions is submodular). Hence, one could combine a submodular function that models diversity with another that models coverage, and use human supervision to learn a suitable submodular function for the problem.
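The coverage and facility location functions described above can be illustrated with a minimal sketch. The sentence identifiers, concept sets, similarity values and function names below are hypothetical and chosen only for illustration. The brute-force check tests the diminishing-returns property that defines submodularity on this toy data; it is not a practical test for large ground sets.

```python
from itertools import combinations

# Hypothetical toy data: each sentence is mapped to the concepts it mentions.
CONCEPTS = {
    "s1": {"budget", "deadline"},
    "s2": {"deadline", "staffing"},
    "s3": {"budget"},
    "s4": {"risk", "staffing"},
}

# Hypothetical symmetric pairwise similarities between sentences, in [0, 1].
SIMILARITY = {
    ("s1", "s2"): 0.5, ("s1", "s3"): 0.7, ("s1", "s4"): 0.1,
    ("s2", "s3"): 0.2, ("s2", "s4"): 0.6, ("s3", "s4"): 0.0,
}


def sim(a, b):
    """Similarity lookup; a sentence is fully similar to itself."""
    if a == b:
        return 1.0
    return SIMILARITY.get((a, b), SIMILARITY.get((b, a), 0.0))


def coverage(summary):
    """Set cover function: number of distinct concepts the summary covers."""
    covered = set()
    for sentence in summary:
        covered |= CONCEPTS[sentence]
    return len(covered)


def facility_location(summary):
    """Sum, over all sentences, of the similarity to the closest summary sentence."""
    if not summary:
        return 0.0
    return sum(max(sim(i, j) for j in summary) for i in CONCEPTS)


def combined(summary, weight=0.5):
    """Nonnegative weighted sum of the two functions above."""
    return coverage(summary) + weight * facility_location(summary)


def has_diminishing_returns(f, ground):
    """Check f(A + x) - f(A) >= f(B + x) - f(B) for all A within B and x outside B."""
    items = sorted(ground)
    subsets = [frozenset(c) for r in range(len(items) + 1)
               for c in combinations(items, r)]
    for b in subsets:
        for a in subsets:
            if not a <= b:
                continue
            for x in ground - b:
                gain_small = f(a | {x}) - f(a)
                gain_large = f(b | {x}) - f(b)
                if gain_small < gain_large - 1e-9:
                    return False
    return True


if __name__ == "__main__":
    ground = set(CONCEPTS)
    for name, f in [("coverage", coverage),
                    ("facility_location", facility_location),
                    ("combined", combined)]:
        print(name, has_diminishing_returns(f, ground))
```

In this sketch the combined function simply adds the two objectives with a fixed weight; learning such weights from human supervision, as mentioned above, is outside its scope.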
Submodular functions are not only well suited to modeling summarization problems; they also admit very efficient optimization algorithms. For example, when maximizing a monotone submodular function subject to a cardinality constraint, a simple [[greedy algorithm]] achieves a constant-factor approximation guarantee of 1 − 1/e.<ref>Nemhauser, George L., Laurence A. Wolsey, and Marshall L. Fisher. "An analysis of approximations for maximizing submodular set functions—I." Mathematical Programming 14.1 (1978): 265-294.</ref> Moreover, the greedy algorithm is extremely simple to implement and can scale to large datasets, which is very important for summarization problems.
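The greedy algorithm mentioned above can be sketched as follows for selecting at most ''k'' sentences. The data and the names <code>greedy_max</code> and <code>coverage</code> are hypothetical; the objective is a toy coverage function, which is monotone and submodular, so the 1 − 1/e guarantee of Nemhauser, Wolsey and Fisher applies to it under a cardinality constraint. This is a plain illustration, not an optimized implementation.

```python
# Hypothetical toy data: each sentence is mapped to the concepts it mentions.
CONCEPTS = {
    "s1": {"a", "b", "c"},
    "s2": {"c", "d"},
    "s3": {"a", "b"},
    "s4": {"d", "e"},
}


def coverage(summary):
    """Number of distinct concepts covered by the chosen sentences."""
    covered = set()
    for sentence in summary:
        covered |= CONCEPTS[sentence]
    return len(covered)


def greedy_max(f, ground, k):
    """Greedily pick up to k elements, each time adding the largest marginal gain."""
    selected = []
    remaining = set(ground)
    for _ in range(min(k, len(remaining))):
        current = f(selected)
        # Iterate in sorted order so that ties are broken deterministically.
        best = max(sorted(remaining),
                   key=lambda x: f(selected + [x]) - current)
        if f(selected + [best]) - current <= 0:
            break  # no remaining element improves the objective
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    summary = greedy_max(coverage, CONCEPTS, k=2)
    print(summary, coverage(summary))
```

Each round re-evaluates the objective for every remaining candidate, so the sketch makes on the order of ''nk'' function evaluations for ''n'' candidates.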