# American Institute of Mathematical Sciences

October  2016, 1(4): 309-340. doi: 10.3934/bdia.2016013

## A testbed to enable comparisons between competing approaches for computational social choice

1. University of Waterloo & New College of Florida, 5800 Bayshore Road, Sarasota, FL 34234, USA
2. University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada

* Corresponding author: John A. Doucette

The authors acknowledge support from the Natural Sciences and Engineering Research Council of Canada, the Ontario Graduate Scholarship Program, and the University of Waterloo

Revised: April 2017. Published: April 2017.

Within artificial intelligence, the field of computational social choice studies the application of AI techniques to the problem of group decision making, especially through systems where each agent submits a vote taking the form of a total ordering over the alternatives (a preference). Reaching a reasonable decision becomes more difficult when some agents are unwilling or unable to rank all the alternatives, and appropriate voting systems must be devised to handle the resulting incomplete preference information. In this paper, we present a detailed testbed that can be used to perform information analytics in this domain. We illustrate the testbed in action in the context of determining a winner or putting candidates into ranked order, using data from real-world elections, and demonstrate how to use the results of the testbed to produce effective comparisons between competing algorithms.

Citation: John A. Doucette, Robin Cohen. A testbed to enable comparisons between competing approaches for computational social choice. Big Data & Information Analytics, 2016, 1 (4) : 309-340. doi: 10.3934/bdia.2016013
##### References:
[1] H. Azari, D. Parkes and L. Xia, Random Utility Theory for Social Choice, in Advances in Neural Information Processing Systems, NIPS Foundation, (2012), 126-134.
[2] A. Balz and R. Senge, WEKA-LR: A Label Ranking Extension for WEKA, URL http://www.uni-marburg.de/fb12/kebi/research/software/labelrankdoc.pdf.
[3] S. Bouveret, Whale3 - WHich ALternative is Elected, URL http://strokes.imag.fr/whale3/.
[4] F. Brandt, G. Chabin and C. Geist, Pnyx: A Powerful and User-friendly Tool for Preference Aggregation, in Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, International Foundation for Autonomous Agents and Multiagent Systems, (2015), 1915-1916.
[5] C.-C. Chang and C.-J. Lin, LIBSVM: A library for support vector machines, ACM Transactions on Intelligent Systems and Technology (TIST), 2 (2011), 27-66. doi: 10.1145/1961189.1961199.
[6] G. Charwat and A. Pfandler, Democratix: A Declarative Approach to Winner Determination, in Algorithmic Decision Theory, Springer, (2015), 253-269. doi: 10.1007/978-3-319-23114-3_16.
[7] C. Cortes and V. Vapnik, Support-vector networks, Machine Learning, 20 (1995), 273-297. doi: 10.1007/BF00994018.
[8] J. Dean and S. Ghemawat, MapReduce: Simplified data processing on large clusters, Communications of the ACM, 51 (2008), 107-113. doi: 10.1145/1327452.1327492.
[9] P. Diaconis and R. L. Graham, Spearman's footrule as a measure of disarray, Journal of the Royal Statistical Society, Series B (Methodological), 39 (1977), 262-268.
[10] J. P. Dickerson, A. D. Procaccia and T. Sandholm, Price of fairness in kidney exchange, in Proceedings of the 2014 International Conference on Autonomous Agents and Multi-agent Systems, International Foundation for Autonomous Agents and Multiagent Systems, (2014), 1013-1020.
[11] E. Dimitriadou, K. Hornik, F. Leisch, D. Meyer and A. Weingessel, Misc functions of the Department of Statistics (e1071), TU Wien, R package, 1 (2008), 5-24.
[12] J. A. Doucette, K. Larson and R. Cohen, Conventional machine learning for social choice, in Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, AAAI Press, 2015.
[13] G. H. Dunteman, Principal Components Analysis, no. 69 in Quantitative Applications in the Social Sciences, Sage, 1989.
[14] V. E. Farrugia, H. P. Martínez and G. N. Yannakakis, The preference learning toolbox, arXiv preprint arXiv:1506.01709.
[15] M. Gebser, B. Kaufmann, R. Kaminski, M. Ostrowski, T. Schaub and M. Schneider, Potassco: The Potsdam answer set solving collection, AI Communications, 24 (2011), 107-124.
[16] M. Gelfond and V. Lifschitz, Classical negation in logic programs and disjunctive databases, New Generation Computing, 9 (1991), 365-385. doi: 10.1007/BF03037169.
[17] J. Goldman and A. D. Procaccia, Spliddit: Unleashing fair division algorithms, ACM SIGecom Exchanges, 13 (2015), 41-46. doi: 10.1145/2728732.2728738.
[18] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann and I. H. Witten, The WEKA data mining software: An update, ACM SIGKDD Explorations Newsletter, 11 (2009), 10-18. doi: 10.1145/1656274.1656278.
[19] L. Hatton, The T-experiments: Errors in scientific software, Quality of Numerical Software, (1997), 12-31. doi: 10.1007/978-1-5041-2940-4_2.
[20] L. Hatton and A. Roberts, How accurate is scientific software?, IEEE Transactions on Software Engineering, 20 (1994), 785-797. doi: 10.1109/32.328993.
[21] M. R. Hestenes and E. Stiefel, Methods of conjugate gradients for solving linear systems, Journal of Research of the National Bureau of Standards, 49 (1952), 409-436.
[22] E. Hüllermeier and J. Fürnkranz, Learning from label preferences, in Proceedings of the 14th International Conference on Discovery Science, (2011), 2-17.
[23] T. Joachims, Optimizing search engines using clickthrough data, in Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (2002), 133-142. doi: 10.1145/775047.775067.
[24] T. Joachims, Training linear SVMs in linear time, in Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (2006), 217-226. doi: 10.1145/1150402.1150429.
[25] D. Kelly and R. Sanders, The challenge of testing scientific software, in Proceedings of the 2008 Conference for the Association for Software Testing, (2008), 30-36.
[26] M. G. Kendall, A new measure of rank correlation, Biometrika, 30 (1938), 81-93. doi: 10.1093/biomet/30.1-2.81.
[27] S. Kullback and R. A. Leibler, On information and sufficiency, The Annals of Mathematical Statistics, 22 (1951), 79-86. doi: 10.1214/aoms/1177729694.
[28] T. Lu and C. Boutilier, Learning Mallows models with pairwise preferences, in Proceedings of the 28th International Conference on Machine Learning (ICML-11), (2011), 145-152.
[29] T. Lu and C. Boutilier, Robust approximation and incremental elicitation in voting protocols, in Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, (2011), 287-293.
[30] R. D. Luce, Individual Choice Behavior: A Theoretical Analysis, Wiley, 1959.
[31] C. L. Mallows, Non-null ranking models. I, Biometrika, 44 (1957), 114-130. doi: 10.1093/biomet/44.1-2.114.
[32] N. Mattei and T. Walsh, PrefLib: A library for preferences (http://www.preflib.org), in Algorithmic Decision Theory, Springer, (2013), 259-270.
[33] N. Mattei, PrefLib-Tools: A small and lightweight set of Python tools for working with and generating data from www.PrefLib.org, URL https://github.com/nmattei/PrefLib-Tools.
[34] R. Rifkin and A. Klautau, In defense of one-vs-all classification, The Journal of Machine Learning Research, 5 (2004), 101-141.
[35] B. Ripley and W. Venables, nnet: Feed-forward neural networks and multinomial log-linear models, R package version, 7.
[36] P. Romanski, FSelector: Selecting attributes, Vienna: R Foundation for Statistical Computing.
[37] C. Spearman, 'Footrule' for measuring correlation, British Journal of Psychology, 1904-1920, (1906), 89-108. doi: 10.1111/j.2044-8295.1906.tb00174.x.
[38] J. E. Stone, D. Gohara and G. Shi, OpenCL: A parallel programming standard for heterogeneous computing systems, Computing in Science & Engineering, 12 (2010), 66-73. doi: 10.1109/MCSE.2010.69.
[39] R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2014. ISBN 3-900051-07-0.
[40] L. Xia and V. Conitzer, Determining possible and necessary winners under common voting rules given partial orders, in Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, (2008), 196-201.

A graphical depiction of the data flow in Prefmine, with example input arguments. Rounded boxes denote algorithms, presented in full in the text. Cylinders show external data stores. Sharp-cornered squares denote immutable internal data stores, each output by one of the system's algorithms. Ovals show the system's final output. Processing begins with the leftmost algorithm (the Dataset Loader), which downloads data from the Preflib repository and formats it as an immutable database. The immutable datasets are then passed to the middle algorithm (the Experimental Loop), which runs experiments on them according to its other input parameters, producing an immutable results database. The rightmost algorithm (the Analysis Toolkit) can be used to generate human-readable and LaTeX-formatted tables by querying the results database. Queries are constructed from the input parameters. Existing valid parameter settings are discussed later in the text, but most parameter sets are easily extensible.
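The three-stage data flow described above can be summarized in code. The sketch below is illustrative only: all names and interfaces are assumptions rather than Prefmine's actual API, and the real loader, experiments, and queries are replaced by stubs. What it shows is how the components chain together through immutable intermediate stores.

```python
# Hypothetical sketch of Prefmine's three-stage data flow; names are assumed.

def dataset_loader(dataset_name):
    """Stage 1: fetch a dataset and freeze it (stubbed with in-memory data)."""
    # Each ballot is a tuple ranking candidates from most to least preferred.
    return (("A", "B", "C"), ("B", "A", "C"), ("A", "C", "B"))

def experimental_loop(datasets, num_reps):
    """Stage 2: run experiments over the frozen datasets, keyed by
    (dataset index, repetition) so results can be queried later."""
    results = {}
    for rep in range(num_reps):
        for idx, ballots in enumerate(datasets):
            # Placeholder "measurement": just record the number of ballots.
            results[(idx, rep)] = len(ballots)
    return results

def analysis_toolkit(results, dataset_idx):
    """Stage 3: query the results store to build a summary."""
    return sorted(v for (idx, _rep), v in results.items() if idx == dataset_idx)

data = dataset_loader("debian-2002")
summary = analysis_toolkit(experimental_loop([data], num_reps=2), 0)
print(summary)  # -> [3, 3]
```

The design point the figure emphasizes, immutability of the intermediate stores, is what makes each stage independently re-runnable and queryable.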
The initial Prefmine window, from which users can launch a new experiment, or run the analysis toolkit
The experiment configuration window in Prefmine, from which the user can configure input parameters for both the Dataset Loader and Experimental Loop portions of the system
An example of an experiment configured using Prefmine's experiment configuration window. The experiment will run over the seven Debian datasets from the Preflib repository. Logistic Regression and a Worst-Case Imputation approach will be applied to each problem instance, and the Copeland voting rule will be used to decide outcomes. Performance will be assessed under the Single Winner Error, First Error Location, and Kendall Correlation (τ) performance measures. Five replications will be performed, and the output database will be stored in /tmp/, overwriting any existing results there. Since more than one dataset is being processed, the output filename parameter is ignored. The parameters below the output textbox ("Waiting...") configure synthetic data generators or imputation methods that are not used here, and so are ignored. Pressing the "Create" button will start the experiment
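The Copeland rule selected in this example decides outcomes by pairwise majority comparisons: each candidate earns a point for every other candidate it beats head-to-head. A minimal sketch for complete ballots follows; the function name and the +0.5-per-tie convention are our assumptions, not necessarily Prefmine's implementation.

```python
from itertools import combinations

def copeland(ballots, candidates):
    """Copeland rule: +1 per pairwise majority win, +0.5 per pairwise tie;
    the highest-scoring candidate is the winner."""
    score = {c: 0.0 for c in candidates}
    for a, b in combinations(candidates, 2):
        # A ballot prefers `a` to `b` when `a` appears earlier in the ranking.
        a_wins = sum(1 for r in ballots if r.index(a) < r.index(b))
        b_wins = len(ballots) - a_wins
        if a_wins > b_wins:
            score[a] += 1.0
        elif b_wins > a_wins:
            score[b] += 1.0
        else:
            score[a] += 0.5
            score[b] += 0.5
    winner = max(candidates, key=lambda c: score[c])
    return winner, score

ballots = [("A", "B", "C"), ("A", "C", "B"), ("B", "C", "A")]
winner, scores = copeland(ballots, ["A", "B", "C"])
print(winner, scores)  # -> A {'A': 2.0, 'B': 1.0, 'C': 0.0}
```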
An example of a Prefmine experiment in progress, using the settings from Figure 4. Note the progress bars showing the fraction of datasets processed (top) and runs completed on this dataset (bottom). The text box summarizes progress, including the runtime required to complete each dataset
The Prefmine analysis window. The window is similar to the experiment window, but with a reduced set of options and a larger space to view the output
The location of the dataset selection dropdown menu in Prefmine's experiment window
The Imputation Method selection box in Prefmine's experiment window
The Voting Rules selection box in Prefmine's experiment window
The Performance Metric selection box in Prefmine's experiment window
Empirical cumulative distribution functions for unique ballots in Dublin North and Dublin West. The x-axis ranks ballots from most to least common. The y-axis shows the cumulative proportion of voters who cast each ballot
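Curves like these can be recomputed directly from raw ballots. A minimal sketch (a hypothetical helper, not part of Prefmine): count each unique ballot, sort counts from most to least common, and accumulate the voter shares.

```python
from collections import Counter

def ballot_ecdf(ballots):
    """Empirical CDF over unique ballots, ordered most- to least-common:
    position k maps to the cumulative share of voters who cast one of
    the k most popular ballots."""
    counts = sorted(Counter(ballots).values(), reverse=True)
    total = sum(counts)
    cumulative, ecdf = 0, []
    for c in counts:
        cumulative += c
        ecdf.append(cumulative / total)
    return ecdf

ballots = [("A", "B"), ("A", "B"), ("A", "B"), ("B", "A")]
print(ballot_ecdf(ballots))  # -> [0.75, 1.0]
```

A steeply rising curve indicates that a few ballot types dominate the electorate; a shallow curve indicates highly diverse preferences.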
Algorithm 1. An algorithmic description of the Dataset Loader component of the Prefmine system. The Dataset Loader accepts a string corresponding to the name of a dataset, and then downloads the corresponding .soi file from the Preflib repository [32]. The file is then processed into an immutable Dataset instance, which is produced as output.

```
procedure DatasetLoader(DatasetName)
    url ← lookupURL(DatasetName)
    Let S be a stream obtained by opening url
    |C| ← S.nextLine()                          // number of candidates
    CandMap ← ∅
    for i = 0; i < |C|; i++ do
        {key, name} ← split(S.nextLine(), ",")
        CandMap[key] ← name
    end for
    // The last line of the metadata contains the number of ballots.
    |B| ← split(S.nextLine(), ",")[0]
    Dataset output ← ∅
    while S.hasNextLine() do
        // Each remaining line is parsed into a top order.
        tokens ← split(S.nextLine(), ",")
        numBallots ← tokens[0]
        Datum d ← ∅
        notAssigned ← CandMap.keys()
        for i = 1; i < |tokens|; i++ do
            d[i] ← tokens[i]
            notAssigned ← notAssigned \ {tokens[i]}
        end for
        // Unranked candidates fill the remaining positions.
        for i = |tokens|; i < |C|; i++ do
            d[i] ← notAssigned
        end for
        for i = 0; i < numBallots; i++ do
            output ← append(output, d.duplicate())
        end for
    end while
    return cast(Immutable Dataset) output
end procedure
```
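The parsing step of Algorithm 1 can be sketched concretely in Python. This is a minimal sketch under assumptions: the network fetch is replaced by an in-memory stream, the field layout follows the pseudocode (candidate count, candidate id/name lines, a metadata line whose first field is the ballot count, then `count,cand,cand,...` lines), and unranked candidates are appended in arbitrary order rather than as an explicit tied group.

```python
import io

def load_soi(stream):
    """Parse a PrefLib .soi-style stream into a tuple of top orders,
    mirroring Algorithm 1 (network fetch replaced by a local stream)."""
    num_cands = int(stream.readline())
    cand_map = {}
    for _ in range(num_cands):
        key, name = stream.readline().strip().split(",", 1)
        cand_map[key.strip()] = name.strip()
    # Metadata line: the first field is the number of ballots.
    stream.readline()
    ballots = []
    for line in stream:
        tokens = [t.strip() for t in line.strip().split(",")]
        count, ranked = int(tokens[0]), tokens[1:]
        # Candidates the voter left unranked are appended at the bottom.
        unranked = [k for k in cand_map if k not in ranked]
        ballots.extend([tuple(ranked + unranked)] * count)
    return tuple(ballots)  # tuples stand in for the immutable dataset

soi = io.StringIO("2\n1,Alice\n2,Bob\n3,3,2\n2,1,2\n1,2\n")
print(load_soi(soi))  # -> (('1', '2'), ('1', '2'), ('2', '1'))
```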
Algorithm 2. An algorithmic description of the Experimental Loop component of the Prefmine system. The Experimental Loop accepts a set of Immutable Datasets (produced using Algorithm 1), an ablation mode, a list of social choice algorithms (e.g., imputation-based, Minimax Regret), a list of voting rules (e.g., Borda, Copeland), a list of performance measures (e.g., Single Winner Error, First Error Location), and a number of repetitions for the experiment. For each repetition, a new problem instance is generated from each dataset using the specified ablation mode. Then every social choice algorithm is run under every voting rule to produce an outcome. The outcome is compared, under every performance measure, to the ground truth outcome for the voting rule under consideration. The results are written to an output database, which is rendered read-only as the final step.

```
procedure ExperimentalLoop(ImmutableDatasets, AblationMode, ListOfSCAlgs,
                           ListOfVotingRules, ListOfPerfMeasures, NumReps)
    Output ← ∅
    for Rep = 0; Rep < NumReps; Rep++ do
        for all Dataset in ImmutableDatasets do
            Problem ← AblationMode(Dataset.copy())
            for all Alg in ListOfSCAlgs do
                for all Rule in ListOfVotingRules do
                    if Alg uses imputation then
                        Outcome ← Rule(Alg(Problem.expressedPrefs.copy()))
                    else
                        Outcome ← Alg(Problem.expressedPrefs.copy(), Rule)
                    end if
                    CorrectOutcome ← Rule(Problem.truePrefs)
                    for all Measure in ListOfPerfMeasures do
                        Key ← concatenateNames(Dataset, Alg, Rule, Measure, Rep)
                        Output[Key] ← Measure(Outcome, CorrectOutcome)
                    end for
                end for
            end for
        end for
    end for
    return cast(Immutable) Output
end procedure
```
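The nested sweep of Algorithm 2 condenses to a few lines of Python. This sketch is built on assumptions: every argument is a dictionary of named callables, only imputation-style algorithms (rule applied to imputed ballots) are handled, and ablation is stubbed out. It is meant to show the key structure of the results store, (dataset, algorithm, rule, measure, repetition), not Prefmine's actual interfaces.

```python
import random

def experimental_loop(datasets, ablate, algs, rules, measures, num_reps, seed=0):
    """Sketch of Algorithm 2's algorithm x rule x measure sweep
    (imputation-style algorithms only; interfaces are assumed)."""
    rng = random.Random(seed)  # a real ablation mode would draw from this
    output = {}
    for rep in range(num_reps):
        for ds_name, ballots in datasets.items():
            expressed, true_prefs = ablate(ballots, rng)
            for alg_name, alg in algs.items():
                for rule_name, rule in rules.items():
                    outcome = rule(alg(expressed))   # impute, then vote
                    correct = rule(true_prefs)       # ground-truth outcome
                    for m_name, measure in measures.items():
                        key = (ds_name, alg_name, rule_name, m_name, rep)
                        output[key] = measure(outcome, correct)
    return output

datasets = {"toy": (("A", "B"), ("B", "A"), ("A", "B"))}
ablate = lambda b, rng: (b, b)                        # identity "ablation"
algs = {"identity": lambda b: b}                      # imputation stub
rules = {"plurality": lambda b: max(set(r[0] for r in b),
                                    key=lambda c: sum(r[0] == c for r in b))}
measures = {"swe": lambda out, cor: int(out != cor)}  # single-winner error
out = experimental_loop(datasets, ablate, algs, rules, measures, num_reps=2)
```

Keying every cell of the full factorial design makes the output database directly queryable by the Analysis Toolkit, with no post-hoc joins.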
Algorithm 3. An algorithmic description of the Analysis Toolkit component of the Prefmine system. The Analysis Toolkit accepts an output database produced by the Experimental Loop component of the system (Algorithm 2). The user also specifies a particular social choice algorithm and voting rule, as well as a list of datasets and performance measures. The toolkit computes a table in which each row corresponds to a dataset and each column to a performance measure. The value in a given cell is the mean and standard deviation achieved by the selected social choice algorithm under the selected voting rule, with respect to the corresponding performance measure (column) and dataset (row). The resulting table is then output both in a human-readable format and as a LaTeX tabular environment.

```
procedure AnalysisToolkit(OutputDatabase, SCAlg, VotingRule,
                          ListOfPerfMeasures, ListOfDatasets)
    Table ← ∅
    for all Dataset in ListOfDatasets do
        for all Measure in ListOfPerfMeasures do
            Key ← concatenateNames(Dataset, SCAlg, VotingRule, Measure, *)
            Results ← OutputDatabase[Key]
            Mean ← computeMean(Results)
            Stdev ← computeStandardDeviation(Results)
            Table[Dataset][Measure]["mean"] ← Mean
            Table[Dataset][Measure]["stdev"] ← Stdev
        end for
    end for
    Print(Table)
    PrintLatex(Table)
end procedure
```
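The aggregation in Algorithm 3 amounts to a grouped mean and sample standard deviation over matching result keys. A sketch follows, assuming tuple keys of the form (dataset, algorithm, rule, measure, repetition); this layout is an assumption for illustration, not Prefmine's actual database schema, and the LaTeX rendering step is omitted.

```python
import statistics

def analysis_toolkit(output_db, sc_alg, rule, perf_measures, datasets):
    """Sketch of Algorithm 3: build a dataset x measure table of means and
    sample standard deviations over all repetitions (the wildcard in the
    pseudocode's Key becomes a match on everything but the repetition)."""
    table = {}
    for ds in datasets:
        table[ds] = {}
        for m in perf_measures:
            vals = [v for (d, a, r, mm, _rep), v in output_db.items()
                    if (d, a, r, mm) == (ds, sc_alg, rule, m)]
            table[ds][m] = {
                "mean": statistics.mean(vals),
                "stdev": statistics.stdev(vals) if len(vals) > 1 else 0.0,
            }
    return table

db = {("toy", "lr", "copeland", "tau", rep): v
      for rep, v in enumerate([0.9, 1.0, 0.8])}
t = analysis_toolkit(db, "lr", "copeland", ["tau"], ["toy"])
print(round(t["toy"]["tau"]["mean"], 2))  # -> 0.9
```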
Table showing the mean firstError, singleWinner, and tau measures for the instantiation of the imputation-based approach using logistic regression on the Copeland social choice function. Reported values are the mean over many problem instances, and reported measurement errors are the sample standard deviations. The rightmost columns report the number of candidates, and the percentage of missing information in each set
| Dataset | First Error | Single Winner | $\tau$ | $\vert C\vert$ | % Missing |
|---|---|---|---|---|---|
| Debian 2002 | $4.00 \pm 0.00$ | $0.00 \pm 0.00$ | $1.00 \pm 0.00$ | 4 | 11.9 |
| Debian 2003 | $4.94 \pm 0.49$ | $0.01 \pm 0.12$ | $0.99 \pm 0.07$ | 5 | 13.6 |
| Debian 2005 | $7.00 \pm 0.00$ | $0.00 \pm 0.00$ | $1.00 \pm 0.00$ | 7 | 15.5 |
| Debian 2006 | $6.35 \pm 0.76$ | $0.00 \pm 0.00$ | $0.94 \pm 0.03$ | 8 | 14.8 |
| Debian 2007 | $9.00 \pm 0.00$ | $0.00 \pm 0.00$ | $0.90 \pm 0.08$ | 9 | 19.1 |
| Debian 2010 | $5.00 \pm 0.00$ | $0.00 \pm 0.00$ | $1.00 \pm 0.00$ | 5 | 11.0 |
| Debian 2012 | $4.00 \pm 0.00$ | $0.00 \pm 0.00$ | $1.00 \pm 0.00$ | 4 | 13.2 |
| Debian Logo | $5.18 \pm 1.36$ | $0.00 \pm 0.00$ | $0.75 \pm 0.11$ | 8 | 40.0 |
| Dublin North | $4.01 \pm 0.07$ | $0.00 \pm 0.00$ | $0.85 \pm 0.01$ | 12 | 58.5 |
| Dublin West | $3.85 \pm 1.88$ | $0.82 \pm 0.38$ | $0.81 \pm 0.06$ | 9 | 50.8 |
| Meath | $1.00 \pm 0.00$ | $3.54 \pm 0.32$ | $0.74 \pm 0.02$ | 14 | 66.8 |
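The $\tau$ columns report Kendall's rank correlation [26] between the computed and ground-truth rankings, ranging from 1 (identical orders) to -1 (exactly reversed orders). A direct $O(n^2)$ sketch for total orders over the same candidate set:

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall's tau between two total orders over the same candidates:
    (concordant pairs - discordant pairs) / total pairs."""
    pos_a = {c: i for i, c in enumerate(rank_a)}
    pos_b = {c: i for i, c in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        # Same sign: both rankings order the pair the same way (concordant).
        s = (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y])
        if s > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = len(rank_a) * (len(rank_a) - 1) / 2
    return (concordant - discordant) / pairs

print(kendall_tau(("A", "B", "C"), ("A", "B", "C")))  # -> 1.0
print(kendall_tau(("A", "B", "C"), ("C", "B", "A")))  # -> -1.0
```

Under this measure, the negative $\tau$ values reported for Minimax Regret on Dublin North and Meath indicate output rankings that disagree with the ground truth on most candidate pairs.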
Table showing the mean firstError, singleWinner, and tau measures for the Minimax Regret approach on the Copeland social choice function. Reported values are the mean over many problem instances, and reported measurement errors are the sample standard deviations. The rightmost columns report the number of candidates, and the percentage of missing information in each set
| Dataset | First Error | Single Winner | $\tau$ | $\vert C\vert$ | % Missing |
|---|---|---|---|---|---|
| Debian 2002 | $4.00 \pm 0.00$ | $0.00 \pm 0.00$ | $1.00 \pm 0.00$ | 4 | 11.9 |
| Debian 2003 | $4.40 \pm 0.84$ | $0.02 \pm 0.09$ | $0.90 \pm 0.11$ | 5 | 13.6 |
| Debian 2005 | $6.57 \pm 1.54$ | $0.04 \pm 0.13$ | $0.99 \pm 0.02$ | 7 | 15.5 |
| Debian 2006 | $6.00 \pm 0.10$ | $0.00 \pm 0.00$ | $0.93 \pm 0.00$ | 8 | 14.8 |
| Debian 2007 | $2.05 \pm 0.37$ | $0.00 \pm 0.00$ | $0.87 \pm 0.03$ | 9 | 19.1 |
| Debian 2010 | $3.04 \pm 1.39$ | $0.00 \pm 0.00$ | $0.82 \pm 0.14$ | 5 | 11.0 |
| Debian 2012 | $4.00 \pm 0.00$ | $0.00 \pm 0.00$ | $1.00 \pm 0.00$ | 4 | 13.2 |
| Debian Logo | $5.44 \pm 1.00$ | $0.00 \pm 0.00$ | $0.74 \pm 0.12$ | 8 | 40.0 |
| Dublin North | $3.98 \pm 0.14$ | $0.00 \pm 0.00$ | $-0.09 \pm 0.03$ | 12 | 58.5 |
| Dublin West | $3.00 \pm 0.00$ | $0.72 \pm 0.13$ | $0.52 \pm 0.05$ | 9 | 50.8 |
| Meath | $2.97 \pm 0.17$ | $0.00 \pm 0.00$ | $-0.46 \pm 0.04$ | 14 | 66.8 |
A summary of the results from the preliminary experiment comparing feature selection methods and classifiers for use with the imputation-based approach to resolving social choice with incomplete information. Performance is reported under the Borda Error measure, which is related to the classifier's accuracy in imputing ballots. Performance which is statistically indistinguishable from the best on any given dataset has been rendered in bold
| Dataset | SVM (PCA) | SVM (IG) | SVM (plain) | NB (PCA) | NB (IG) | NB (plain) | MN (PCA) | MN (IG) | MN (plain) |
|---|---|---|---|---|---|---|---|---|---|
| Deb. 2002 | 0.008 | 0.007 | 0.008 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 |
| Deb. 2003 | 0.009 | 0.005 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 |
| Deb. 2005 | 0.014 | 0.013 | 0.016 | 0.031 | 0.031 | 0.031 | 0.031 | 0.031 | 0.031 |
| Deb. 2006 | 0.013 | 0.013 | 0.015 | 0.031 | 0.031 | 0.031 | 0.031 | 0.031 | 0.031 |
| Deb. 2007 | 0.033 | 0.031 | 0.033 | 0.043 | 0.043 | 0.043 | 0.043 | 0.043 | 0.043 |
| Deb. 2010 | 0.004 | 0.004 | 0.004 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 |
| Deb. 2012 | 0.011 | 0.011 | 0.011 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 | 0.003 |
| Deb. Logo | 0.006 | 0.008 | 0.006 | 0.011 | 0.011 | 0.011 | 0.011 | 0.011 | 0.011 |
| North | 1.084 | 0.923 | 1.08 | 1.546 | 1.546 | 1.546 | 1.546 | 1.546 | 1.546 |
| West | 0.549 | 0.458 | 0.575 | 0.654 | 0.654 | 0.654 | 0.654 | 0.654 | 0.654 |
A summary of the results from the preliminary experiment comparing feature selection methods and classifiers for use with the imputation-based approach to resolving social choice with incomplete information. Performance is reported under the Bias measure, which is the correlation between the classifier's error in imputing a given candidate and the popularity of that candidate. Performance that is statistically indistinguishable from the best on any given dataset is rendered in bold.
| Dataset | SVM PCA | SVM IG | SVM plain | NB PCA | NB IG | NB plain | MN PCA | MN IG | MN plain |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Deb. 2002 | 0.151 | 0.155 | 0.159 | 0.109 | 0.109 | 0.109 | 0.109 | 0.109 | 0.109 |
| Deb. 2003 | 0.088 | 0.030 | 0.051 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 | 0.009 |
| Deb. 2005 | 0.255 | 0.182 | 0.309 | 0.031 | 0.031 | 0.031 | 0.031 | 0.031 | 0.031 |
| Deb. 2006 | 0.171 | 0.040 | 0.203 | 0.011 | 0.011 | 0.011 | 0.011 | 0.011 | 0.011 |
| Deb. 2007 | 0.280 | 0.170 | 0.296 | 0.065 | 0.065 | 0.065 | 0.065 | 0.065 | 0.065 |
| Deb. 2010 | 0.078 | 0.162 | 0.064 | 0.099 | 0.099 | 0.099 | 0.099 | 0.099 | 0.099 |
| Deb. 2012 | 0.34 | 0.35 | 0.32 | 0.12 | 0.12 | 0.12 | 0.12 | 0.12 | 0.12 |
| Deb. Logo | 0.056 | 0.034 | 0.026 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 | 0.007 |
| North | 0.331 | 0.323 | 0.322 | 0.391 | 0.391 | 0.391 | 0.391 | 0.391 | 0.391 |
| West | 0.020 | 0.101 | 0.026 | 0.041 | 0.041 | 0.041 | 0.041 | 0.041 | 0.041 |
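The Bias measure is described as a correlation between per-candidate imputation error and candidate popularity. Assuming an ordinary Pearson correlation over per-candidate summaries (the specific error and popularity statistics below are hypothetical placeholders), it can be sketched as:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-candidate imputation error rates and vote shares:
# a negative correlation would mean popular candidates are imputed
# more accurately, i.e. the classifier favors front-runners.
error_rate = [0.05, 0.12, 0.30, 0.22]
popularity = [0.40, 0.30, 0.10, 0.20]
print(pearson(error_rate, popularity))
```

A Bias near zero (as in most cells of the table above) indicates the imputation errors are spread roughly evenly across popular and unpopular candidates.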