{
"metadata": {
"name": "119-HOMEWORK"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Social Network Evaluation on SNAP Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There is a large collection of social network data available here:\n",
"\n",
"https://snap.stanford.edu/data/index.html\n",
"\n",
"- download the ego-GPlus, ego-Twitter, and soc-Slashdot0922 databases\n",
"- using networkx, compute the social network statistics discussed in class\n",
"- write up and discuss your results\n",
"\n",
"Note that this is deliberately left at a high level: try to find interesting patterns in the data, differences between the social networks, and explanations for those differences. If you are not familiar with one or more of these networks, you may want to sign up for them and try them out."
]
},
{
"cell_type": "heading",
"level": 1,
"metadata": {},
"source": [
"Power of Different Tests"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Your task is to determine the relative power of the $t$-test, the $U$-test, and a random sampling version of the permutation test on unimodal distributions. Obvious questions to ask are:\n",
"\n",
"- Assuming the data is normal, what significance levels do you get for the same data?\n",
"- How do the relative powers depend on the size of the data set?\n",
"- How do the relative powers depend on the size of difference in means?\n",
"- How does the $t$-test do when applied to non-normal distributions?\n",
"\n",
"You should probably also make sure that you are using enough random samples in the permutation test; how many is enough?\n",
"\n",
"Perform numerical experiments using random sampling similar to that used in class.\n",
"\n",
"Select your experiments carefully in order to come up with some general recommendations.\n",
"General recommendations might look like:\n",
"\n",
"- \"If you know your data is normal, the $t$ test performs significantly better for all sample sizes and differences in mean.\" ???\n",
"- \"For large differences in mean, the $U$ test is nearly as good as the $t$ test at all sample sizes.\" ???\n",
"- \"The permutation test and the $U$ test perform about the same for normal data.\" ???\n",
"\n",
"Consider at least the normal distribution, the uniform distribution over an interval, and the Cauchy distribution."
]
},
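{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of how the quantities in this task might be made precise; the symbols $N$, $B$, $\\alpha$, and $T$ are notation assumed here for illustration, not notation fixed by the assignment. Suppose each experiment draws $N$ independent pairs of samples with a known difference in means, and trial $i$ yields a $p$-value $p_i$ from the test under study. The estimated power at level $\\alpha$ is then the fraction of trials that reject:\n",
"\n",
"$$\\hat{\\pi} = \\frac{1}{N} \\sum_{i=1}^{N} \\mathbf{1}[p_i \\le \\alpha], \\qquad \\mathrm{SE}(\\hat{\\pi}) \\approx \\sqrt{\\frac{\\hat{\\pi}(1 - \\hat{\\pi})}{N}}$$\n",
"\n",
"For the random-sampling permutation test, one common estimate (assuming a two-sided test with $T$ the difference in sample means) pools the two samples, reshuffles the group labels $B$ times, and compares each reshuffled statistic $T^{(b)}$ with the observed $T_{\\mathrm{obs}}$:\n",
"\n",
"$$\\hat{p} = \\frac{1 + \\sum_{b=1}^{B} \\mathbf{1}[|T^{(b)}| \\ge |T_{\\mathrm{obs}}|]}{B + 1}$$\n",
"\n",
"The added 1 in the numerator and denominator is a common convention that keeps $\\hat{p}$ away from zero; the plain fraction is also used. Because $\\hat{p}$ is itself a binomial proportion, its sampling error is roughly $\\sqrt{p(1 - p)/B}$, which gives one way to judge whether $B$ is large enough near the chosen $\\alpha$."
]
},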
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": []
}
],
"metadata": {}
}
]
}