I design and explore computer models that simulate networks of scientists who generate random samples to answer particular scientific questions. In doing so, I reveal unacknowledged and underappreciated features that are inherent to science at a fundamental level, which in turn informs our understanding of more complicated situations. In addition, I critically analyze the limitations of such techniques and their ability (as well as inability) to reveal different aspects of the situations at hand.
More specifically, I represent scientists as members of a network (shown on the right) who each produce their own statistical research and share it with anyone they are connected to. I analyze the effects of different network structures, as well as of the scientists' strategies for sharing, trusting, and updating. The result is a set of novel, elegant explanations of several problems that arise in epistemic communities, such as polarization and groupthink. Unlike previous explanations, these rely solely on the nature of statistical data and its distribution, and so they hold regardless of whether the agents themselves are rational.
This model demonstrates how strategies for trusting and connecting can influence the flow of information, and specifically the flow of variance. I decouple strategies for connecting with other agents from strategies for weighting the information one hears from them, offering a more nuanced approach to concepts like homophily and heterophily.
In addition, I consider different strategies for sharing information, including full disclosure versus sharing only those results that one believes are most epistemically beneficial. I show how seemingly innocent strategies like the latter can lead to outcomes tantamount to fabricating results. In doing so, I introduce a useful way of modeling bias.
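The effect of selective sharing can be illustrated with a minimal sketch. It is not the model described above: the function names, the fixed sharing threshold, and the parameter values are all assumptions chosen for illustration. An agent runs many small studies of a fair Bernoulli process and either shares all of them or shares only those whose observed frequency meets a threshold.

```python
import random

def run_study(p, n, rng):
    """Observed success frequency in one study of n Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n)) / n

def shared_mean(p, n, studies, threshold, rng):
    """Average of the studies an agent chooses to share.

    threshold=None models full disclosure. Otherwise only studies whose
    observed frequency is at least `threshold` are shared (a hypothetical
    stand-in for "sharing what one believes is most beneficial").
    """
    results = [run_study(p, n, rng) for _ in range(studies)]
    if threshold is not None:
        results = [r for r in results if r >= threshold]
    return sum(results) / len(results) if results else float("nan")

rng = random.Random(0)
full = shared_mean(p=0.5, n=20, studies=5000, threshold=None, rng=rng)
selective = shared_mean(p=0.5, n=20, studies=5000, threshold=0.5, rng=rng)
print(f"full disclosure: {full:.3f}, selective sharing: {selective:.3f}")
```

Every shared study in this sketch is honest data, yet the pooled record under selective sharing sits above the true frequency of 0.5, which is the sense in which filtering alone can mimic fabrication.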
I decouple individuals' marginal benefit from aggregate benefit. I show that members of a community can benefit marginally across modal space while suffering in the aggregate. That is, they can increase their chances of individually voting correctly while decreasing the chances that the community vote is correct.
This demonstrates a sort of mereological conflict, that is, a conflict between individuals and those same individuals as a group. In particular, this is a modal conflict, one that occurs across modal space as opposed to being realizable in any particular scenario.
The variance of any random variable can be decomposed into three things:
Modal variance - differences between possible aggregate outcomes of a community
Social variance - differences between members of the community
Individual variance - differences between samples an individual observes
The model demonstrates this decomposition and shows how it depends on the random variable, the testing power, the community structure, and the updating process of the agents.
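One way to make the three-part decomposition concrete is the nested form of the law of total variance. The sketch below is an assumption about how the three terms might be operationalized, not the model's own code: `decompose` and the data layout `data[run][agent]` are hypothetical names. With equally sized groups and population variances, the three terms sum exactly to the total variance of all samples.

```python
import random
from statistics import fmean, pvariance

def decompose(data):
    """Split total variance into (modal, social, individual, total).

    data[run][agent] is a list of samples; every run has the same number
    of agents and every agent the same number of samples.
    """
    run_means, social_terms, individual_terms = [], [], []
    for run in data:
        agent_means = [fmean(samples) for samples in run]
        run_means.append(fmean(agent_means))
        # Differences between members of the community within this run.
        social_terms.append(pvariance(agent_means))
        # Differences between the samples each individual observes.
        individual_terms.append(fmean([pvariance(s) for s in run]))
    modal = pvariance(run_means)  # differences between aggregate outcomes
    social = fmean(social_terms)
    individual = fmean(individual_terms)
    total = pvariance([x for run in data for s in run for x in s])
    return modal, social, individual, total

# Illustrative data: 50 runs, 10 agents, 20 Bernoulli(0.3) samples each.
rng = random.Random(0)
data = [[[float(rng.random() < 0.3) for _ in range(20)]
         for _ in range(10)] for _ in range(50)]
modal, social, individual, total = decompose(data)
print(modal, social, individual, total)
```

In this sketch the agents sample independently, so most of the variance lands in the individual term; how sharing and updating move variance between the three terms is what the model itself explores.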
This model incorporates both the paradigmatic information cascade and wise crowd models as two regimes of one more general model. I demonstrate how the difference between the two comes down to the dependence between their public votes. There are two main upshots:
First, the project shows how we can think of epistemic communities as characterized by phases and phase changes, much like physical systems.
Second, I show that there is a level of dependence (or trust) between community members that allows cascades to nearly always be right. In other words, past a certain point, members can ignore evidence without epistemic detriment.
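The two regimes can be sketched with a single dependence parameter. This is a simplified illustration under stated assumptions, not the general model itself: the `trust` parameter, the copy-the-majority rule, and the values of `q` and `n_agents` are all hypothetical. With `trust=0.0` every agent votes on its own private signal (the wise crowd); with `trust=1.0` every agent after the first copies the public majority (a cascade).

```python
import random

def majority_correct(n_agents, q, trust, rng):
    """Run one sequential vote; return True if the final majority is correct.

    Each private signal is correct with probability q. With probability
    `trust` an agent copies the current majority of earlier public votes;
    otherwise, or when that tally is tied, it votes its own signal.
    """
    correct_votes = 0
    for i in range(n_agents):
        wrong_votes = i - correct_votes
        signal = rng.random() < q
        if correct_votes != wrong_votes and rng.random() < trust:
            vote = correct_votes > wrong_votes
        else:
            vote = signal
        correct_votes += vote
    return correct_votes > n_agents / 2

def accuracy(trust, n_agents=101, q=0.6, trials=2000, seed=0):
    """Fraction of simulated communities whose majority vote is correct."""
    rng = random.Random(seed)
    hits = sum(majority_correct(n_agents, q, trust, rng) for _ in range(trials))
    return hits / trials

wise_crowd = accuracy(trust=0.0)
cascade = accuracy(trust=1.0)
print(f"independent votes: {wise_crowd:.3f}, full copying: {cascade:.3f}")
```

In this sketch the independent community is right far more often than any single member, while the fully dependent community is only as reliable as its first voter. Intermediate values of `trust` can be passed to `accuracy` to explore the region between the two regimes.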
The two models below demonstrate a feature of how the variance inherent in scientific evidence manifests as either differences within the community (social variance) or unpredictability of the community as a whole (modal variance).
Both models involve a network of scientists attempting to estimate the true frequency of a random variable (RV).
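The shared setup can be sketched as follows. This is a minimal illustration under stated assumptions rather than either of the two models: the pooling rule, the adjacency-list representation, and all names and parameter values are hypothetical. Each scientist draws Bernoulli samples, pools counts with its neighbors, and reports the pooled frequency. Social variance is measured between agents within a run, and modal variance between community averages across runs.

```python
import random
from statistics import fmean, pvariance

def simulate(neighbors, p, n, rng):
    """One run: every agent draws n Bernoulli(p) samples, then pools counts.

    neighbors[i] lists the agents whose data agent i can see (excluding
    itself). Returns each agent's frequency estimate.
    """
    counts = [sum(rng.random() < p for _ in range(n)) for _ in neighbors]
    estimates = []
    for i, nbrs in enumerate(neighbors):
        visible = [i] + list(nbrs)
        estimates.append(sum(counts[j] for j in visible) / (n * len(visible)))
    return estimates

def social_and_modal(neighbors, p=0.3, n=20, runs=500, seed=0):
    """Average within-run variance and variance of the community mean."""
    rng = random.Random(seed)
    all_runs = [simulate(neighbors, p, n, rng) for _ in range(runs)]
    social = fmean([pvariance(est) for est in all_runs])
    modal = pvariance([fmean(est) for est in all_runs])
    return social, modal

k = 8
isolated = [[] for _ in range(k)]
complete = [[j for j in range(k) if j != i] for i in range(k)]
social_iso, modal_iso = social_and_modal(isolated)
social_com, modal_com = social_and_modal(complete)
print(social_iso, modal_iso)
print(social_com, modal_com)
```

With full sharing every agent holds the same estimate, so the social variance vanishes, but the community as a whole is no more predictable across runs: the modal variance remains.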
Much of the above involves digging deeper into questions I originally explored in my dissertation.