• SophistiCat
    2.2k
    Here are a few stats that I scraped from the public member list.

    • Date: 10-08-2020
    • Total posts: 432,641
    • Registered users: 7,603
    • Users with no posts: 64%
    • Posters (>0 posts) with fewer than 10 posts: 63%
    • Posters with fewer than 626 posts: 95%
    • Posters with a single post ("drive-by"): 25%

    Note that these stats include banned and inactive users.

    (Click on the links to see graphs)

    Number of posts by join date scatter

    Number of posts pie chart

    Top posters
  • Sir2u
    3.2k
    I am not great at math nor statistics, but I don't think there are 222% of users here in the forum. :smirk:
  • Caldwell
    1.3k
    ↪SophistiCat I am not great at math nor statistics, but I don't think there are 222% of users here in the forum. :smirk:Sir2u

    :grin: No, SophistiCat's stats did not add up to 222%. Please read it again.
  • SophistiCat
    2.2k
    Corrected one figure: 95% of posters have < 626 posts (was 130). Added a stat for drive-bys.

    Note that "posters" are a subset of "users," and posters with < 10 posts are a subset of posters with < 626 posts.
  • Sir2u
    3.2k
    :grin: No, SophistiCat's stats did not add up to 222%. Please read it again.Caldwell

    But it appears to.

    Note that "posters" are a subset of "users," and posters with < 10 posts are a subset of posters with < 626 posts.SophistiCat

    As I said, I am not that great at statistics, and the manner of presentation is, I think, unusual as well. In the little studying I ever did of statistics (not that much in engineering), it was always made clear to us that the percentages should always add up to 100%.

    One would expect to see something like the following to give a true representation of the data.

    Registered users: 7,603
    Users with no posts: 64% = 4,865
    Etc, etc. until reaching 100%

    Including subsets within subsets blurs the reality of the information. One would, I think, like to see statistics that give a clear picture of the quantity of each group, and the total does make more sense when it adds up to 100%
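A breakdown that does add up to 100% can be sketched from the figures already quoted in the thread. This is only an illustration: it assumes the nested poster percentages are cumulative (single posters are counted inside "fewer than 10", which is counted inside "fewer than 626"), and the function names are made up for the example. The inputs are rounded percentages, so the results are approximate.

```python
# Figures quoted in the thread (rounded percentages).
PCT_USERS_NO_POSTS = 64      # share of all registered users
# Cumulative shares of posters (users with at least one post).
PCT_POSTERS_SINGLE = 25      # exactly 1 post
PCT_POSTERS_UNDER_10 = 63    # 1 to 9 posts
PCT_POSTERS_UNDER_626 = 95   # 1 to 625 posts


def poster_partition():
    """Turn the nested poster percentages into disjoint bins (percent of posters)."""
    return {
        "1 post": PCT_POSTERS_SINGLE,
        "2-9 posts": PCT_POSTERS_UNDER_10 - PCT_POSTERS_SINGLE,
        "10-625 posts": PCT_POSTERS_UNDER_626 - PCT_POSTERS_UNDER_10,
        "626+ posts": 100 - PCT_POSTERS_UNDER_626,
    }


def user_partition():
    """Rescale the poster bins to percent of all registered users, plus non-posters."""
    pct_posters = 100 - PCT_USERS_NO_POSTS
    bins = {"0 posts": float(PCT_USERS_NO_POSTS)}
    for label, pct in poster_partition().items():
        bins[label] = pct * pct_posters / 100
    return bins


if __name__ == "__main__":
    for label, pct in user_partition().items():
        print(f"{label}: {pct:.2f}% of registered users")
```

Each user lands in exactly one bin here, which is why the bins sum to 100% while the original overlapping list does not.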
  • Pantagruel
    3.3k
    Including subsets within subsets blurs the reality of the information. One would, I think, like to see statistics that give a clear picture of the quantity of each group, and the total does make more sense when it adds up to 100%Sir2u

    Percentage of people who like green eggs: 71%
    Percentage of people who like ham: 85%
    Percentage of people who like Dr. Seuss: 93%
    etc., etc., etc..

    The stats weren't presented as an analysis of the collective, but of the properties of individuals. No wonder I have so much difficulty offering systems theoretical arguments.
  • fdrake
    5.9k
    Statistics with fdrake (readers may or may not receive a free car upon reading the post).

    But it appears to.Sir2u

    Posters (>0 posts) with fewer than 10 posts: 63%
    Posters with fewer than 626 posts: 95%
    Posters with a single post ("drive-by"): 25%
    SophistiCat

    These categories aren't mutually exclusive or exhaustive. If you have 1 post, you have <10 posts. If you have <10 posts, you have <626 posts. A list of percentages is only ensured to add up to 100% when they represent a mutually exclusive and exhaustive collection of properties regarding the same population. Exhaustive means everything is counted, mutually exclusive means things are only counted once.

    Example - population of people in each continent except Antarctica.

    List 1

    1 Asia 59.69%
    2 Africa 16.36%
    3 Europe 9.94%
    4 North America 7.79%
    5 South America 5.68%
    6 Oceania 0.54%

    These add to 100% because people who live somewhere in a populated continent live in one of the continents. The list's items are exhaustive: it covers all the population (of people in the populated continents).

    And if you currently are in Asia, you can't also currently be in Africa or Europe or North America and so on. The list's items are mutually exclusive, you can only ever be in one list item.

    If you added Afro-Eurasia to the list - it has 85.99% of the world population (Asia + Africa + Europe). But then it would read:

    List 2
    1 Asia 59.69%
    2 Africa 16.36%
    3 Europe 9.94%
    4 North America 7.79%
    5 South America 5.68%
    6 Oceania 0.54%
    7 Afro-Eurasia 85.99%

    Now they add up to 185.99%. But if you live in Afro-Eurasia, you can live anywhere in Europe or Asia or... It breaks the mutually exclusive thing, since Asians, Africans, Europeans are counted in their list entries but also in Afro-Eurasia.

    If instead you delete Europe from the original list:

    List 3
    1 Asia 59.69%
    2 Africa 16.36%
    3 North America 7.79%
    4 South America 5.68%
    5 Oceania 0.54%

    It adds up to 90.06%. This breaks the exhaustive thing - you can live in Europe but not be on the list.
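The three lists can be checked with a short sketch. It stores the shares as integers in hundredths of a percent to avoid floating-point rounding, and it assumes the Afro-Eurasia share is simply Asia + Africa + Europe from List 1; the names `LIST_1`, `LIST_2`, `LIST_3` and `total_percent` are made up for the example.

```python
# Shares in hundredths of a percent (integers avoid floating-point rounding).
LIST_1 = {
    "Asia": 5969,
    "Africa": 1636,
    "Europe": 994,
    "North America": 779,
    "South America": 568,
    "Oceania": 54,
}


def total_percent(shares):
    """Sum a list of shares and return the total in percent."""
    return sum(shares.values()) / 100


# List 2: add a category that overlaps existing ones (not mutually exclusive).
# Assumption: Afro-Eurasia is taken as the sum of its three continents.
LIST_2 = dict(LIST_1)
LIST_2["Afro-Eurasia"] = LIST_1["Asia"] + LIST_1["Africa"] + LIST_1["Europe"]

# List 3: drop a category (not exhaustive).
LIST_3 = {name: share for name, share in LIST_1.items() if name != "Europe"}

if __name__ == "__main__":
    for name, shares in (("List 1", LIST_1), ("List 2", LIST_2), ("List 3", LIST_3)):
        print(f"{name}: {total_percent(shares):.2f}%")
```

Only List 1 totals exactly 100%. List 2 overshoots because some people are counted twice, and List 3 falls short because some people are not counted at all.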

    The reason it's generally expected that lists of %s sum to 100% is that lists of %s are generally used to represent a mutually exclusive and exhaustive collection of properties (like List 1). SophistiCat's collection of properties doesn't have that property since:

    If you have 1 post, you have <10 posts. If you have <10 posts, you have <626 posts.

    Having <626 posts behaves like "being in Afro-Eurasia" in list 2: it contains all the single-post items (since 1 is less than 626) and all the <10 items (since anything under 10 is also under 626).
  • SophistiCat
    2.2k
    Would it help to put it this way?

    2/3 registered users have never posted
    Of those who have posted, 2/3 have under 10 posts

    Or if you like pies (who doesn't?) Edit: added to the OP
  • Michael
    14.2k
    Can you do a growth chart based on user sign up date?

    Also it might just be me but all I see is this:

    6uwaaaew0kq48d6p.png

    A direct link to the image gives a 403 "Your client does not have permission to get URL..."
  • Jamal
    9.2k
    Also it might just be me but all I see is this:Michael

    A direct link to the image gives a 403 "Your client does not have permission to get URL..."Michael

    Me too. Maybe only you can see them @SophistiCat.
  • Sir2u
    3.2k
    Percentage of people who like green eggs: 71%
    Percentage of people who like ham: 85%
    Percentage of people who like Dr. Seuss: 93%
    etc., etc., etc..
    Pantagruel

    Now this is much clearer, nothing like the other set. It is obvious that we have a hundred percent of people and that a percentage of them like certain things. No subsets involved, just different objects to like or dislike.

    The stats weren't presented as an analysis of the collective, but of the properties of individuals. No wonder I have so much difficulty offering systems theoretical arguments.Pantagruel

    But as an individual I might fall into several of those groups.

    Posters (>0 posts) with fewer than 10 posts: 63% - if I have made 1 post I am in this group
    Posters with fewer than 626 posts: 95% - if I have made 1 post I am in this group
    Posters with a single post ("drive-by"): 25% - if I have made 1 post I am in this group

    As I said, the data is not really clear.
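The overlap described here can be made concrete with a small sketch that lists every quoted group a given post count falls into. The group definitions follow the three poster stats in the thread; `GROUPS` and `groups_for` are hypothetical names used only for illustration.

```python
# Each group is a membership test on a poster's post count.
GROUPS = {
    "fewer than 10 posts": lambda n: 0 < n < 10,
    "fewer than 626 posts": lambda n: 0 < n < 626,
    'single post ("drive-by")': lambda n: n == 1,
}


def groups_for(post_count):
    """Return the names of all groups that a poster with post_count posts belongs to."""
    return [name for name, is_member in GROUPS.items() if is_member(post_count)]


if __name__ == "__main__":
    for count in (1, 5, 100, 1000):
        print(count, groups_for(count))
```

A poster with one post is returned in all three groups, a poster with five posts in two, and a poster with 1,000 posts in none, so the groups neither exclude each other nor cover everyone.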
  • Sir2u
    3.2k
    Would it help to put it this way?

    2/3 registered users have never posted
    Of those who have posted, 2/3 have under 10 posts
    SophistiCat

    Here again you are creating subgroups to try to explain data.

    Or if you like pies (who doesn't?)SophistiCat

    Pie charts are much easier to understand, because they are based on 100% and split into parts equivalent to the percentages. Have you ever tried to create a pie where subgroups appear in several places in the chart?
  • Caldwell
    1.3k
    Pie charts are much easier to understand, because they are based on 100% and split into parts equivalent to the percentages. Have you ever tried to create a pie where subgroups appear in several places in the chart?Sir2u

    A Venn diagram would work here, to present a visual overlapping of counts.
  • Sir2u
    3.2k
    A Venn diagram would work here, to present a visual overlapping of counts.Caldwell

    But they are not that good at representing percentages of a whole.
  • Caldwell
    1.3k
    Including subsets within subsets blurs the reality of the information. One would, I think, like to see statistics that give a clear picture of the quantity of each group, and the total does make more sense when it adds up to 100%Sir2u

    Admittedly, yes. If you read polls by Gallup, for example, they try to make it as basic as possible for the general public. If you want to count individuals under more than one attribute, keep making header titles that highlight the different attributes.
  • SophistiCat
    2.2k
    Don't get hung up on this; there's more than one way to present data, depending on what you want to highlight. Posters are a distinct category, because they are who you actually see and interact with on the forum, so it made sense to me to split that category, instead of the overall number of registered users. Under 10 posts seemed to me like a representative group before I did the actual count - and so it turned out to be. 95% is kind of a magic number that statisticians like to use. And single-posters are an interesting outlier in themselves; I expected to see a lot of these, but not quite as many.

    Shouldn't try to cheat Google :) Added links instead.
  • Sir2u
    3.2k
    Don't get hung up on this;SophistiCat

    I ain't. :up: