  1. #1

    Programming Help needed

    I came across this problem and was wondering if anyone could help me get on the right path to solve it. I just need the right guidance in order to code this properly.

    Suppose that the dataset WORKERS is being used to send out two surveys to a randomly selected sample. You have been asked to randomly assign approximately 10% of the individuals in this dataset to get the first survey. Another 5% of the individuals should be assigned to get the second survey. No individuals should be assigned to get both surveys. Write code that will accomplish this task.

    What is the right way to approach this problem, and what is the math behind it? Let's say the dataset has 1000 individuals.

    Thank you for your time.

  2. #2
    If you have a hardcoded number you need to hit, that's pretty easy. Otherwise, work out how many workers make up 10% and 5%.

    Generate random numbers between 1 and 1000, pick the matching item out of the dataset, and push it into a separate data structure. Repeat until the number of items in that data structure equals the number needed.

    Good luck on the homework.
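
    A minimal Python sketch of that draw-until-full loop, assuming the dataset is a plain list of 1000 placeholder IDs (the real WORKERS layout isn't given). `draw` is a made-up helper name, and it removes each picked item from the pool so the same worker can't be drawn twice:

```python
import random

workers = list(range(1, 1001))   # placeholder IDs standing in for WORKERS
remaining = workers.copy()       # keep the original data untouched

def draw(pool, needed):
    """Move `needed` randomly chosen items out of `pool` into a new list."""
    picked = []
    while len(picked) < needed:
        index = random.randrange(len(pool))  # 0 .. len(pool) - 1
        picked.append(pool.pop(index))       # removing it prevents repeats
    return picked

survey_one = draw(remaining, len(workers) // 10)   # 10% of 1000 -> 100 workers
survey_two = draw(remaining, len(workers) // 20)   # 5% of 1000 -> 50 workers
```

    That leaves three collections: the untouched original, survey one, and survey two.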

  3. #3
    Merely a Setback PACOX's Avatar
    10+ Year Old Account
    Join Date
    Jul 2010
    Location
    ██████
    Posts
    26,375
    As already said, if your sample size is known then you can just cheat a bit, since you already know what 10% and 5% of your sample size are.

    Otherwise you need to first find your sample size and then multiply it by .1 and .05 to get your targets. Then you can create a loop that randomly picks the number 1 or 2, 1 being one survey and 2 being the other. But before that you need to make sure that no more than 10%/5% have already been assigned a survey; you can use simple counter variables that increment only when someone is assigned to a particular survey. If one counter is too high then you automatically assign people to the other group.

    The above could be implemented more efficiently, but that's the basic idea. Mostly just a bunch of loops.

    Resident Cosplay Progressive

  4. #4
    @usiris made it easy for you. The hardest part is tracking who you've already sent the survey to. If you can do that, the rest is just an RNG calculation within the range.

    - - - Updated - - -

    Quote Originally Posted by pacox View Post
    As already said, if your sample size is known then you can just cheat a bit, since you already know what 10% and 5% of your sample size are.

    Otherwise you need to first find your sample size and then multiply it by .1 and .05 to get your targets. Then you can create a loop that randomly picks the number 1 or 2, 1 being one survey and 2 being the other. But before that you need to make sure that no more than 10%/5% have already been assigned a survey; you can use simple counter variables that increment only when someone is assigned to a particular survey. If one counter is too high then you automatically assign people to the other group.

    The above could be implemented more efficiently, but that's the basic idea. Mostly just a bunch of loops.
    So you're saying to "randomly" assign the first survey to the first 10% of the dataset, and the second survey to the next 5% of the dataset (or vice versa)? That doesn't sound...random.

  5. #5
    Thank you all for the help. I appreciate the feedback.

  6. #6
    Quote Originally Posted by Blueobelisk View Post
    @usiris made it easy for you. The hardest part is tracking who you've already sent the survey to. If you can do that, the rest is just an RNG calculation within the range.

    The result of my (and poc's) suggestion should be three datasets: the original data, survey one, and survey two.

    Quote Originally Posted by Blueobelisk View Post
    So you're saying to "randomly" assign the first survey to the first 10% of the dataset, and the second survey to the next 5% of the dataset (or vice versa)? That doesn't sound...random.
    Nah, he's just saying that you have to find out what 10% and 5% are.
    Last edited by usiris; 2016-12-11 at 05:39 AM.

  7. #7
    Stood in the Fire
    15+ Year Old Account
    Join Date
    Apr 2009
    Location
    Sweden
    Posts
    353
    I hope you are doing this in an OOP language, but I would create a class that contains bools for dataset 1 and 2.

    Then I would do this. However, this CAN be dangerous, so add a breakout as well.

    Code:
    int counter = 0
    
    while counter < requiredNumberOfWorkers   // calculate this first
        for each worker
           if worker.has5percentSurvey or worker.has10percentSurvey
             continue   // he already has a survey, and nobody may get both
           else
             n = random(0, 100)
             if n < 5
                 send5percentSurvey()
                 worker.has5percentSurvey = true
                 counter++
             else if n >= 5 and n < 15
                 send10percentSurvey()
                 worker.has10percentSurvey = true
                 counter++
             end if
           end if
        end for
    end while
    Last edited by zetitup; 2016-12-11 at 05:44 AM.

  8. #8
    Quote Originally Posted by Blueobelisk View Post
    @usiris made it easy for you. The hardest part is tracking who you've already sent the survey to. If you can do that, the rest is just an RNG calculation within the range.

    - - - Updated - - -



    So you're saying to "randomly" assign the first survey to the first 10% of the dataset, and the second survey to the next 5% of the dataset (or vice versa)? That doesn't sound...random.
    No, you use a random number to decide whether to add to the 10% or the 5% group. One participant is placed, then you pick another random number. Each time the loop runs, it randomly selects one of the groups.

    1 = 10%
    2 = 5%

    loop
    randomly pick 1 or 2

    if 1

    add participant to 1

    if 2

    add participant to 2

    loop


  9. #9
    Quote Originally Posted by Forte View Post
    Thank you all for the help. I appreciate the feedback.
    I wonder why you'd have to make this thread on your alt account.

    Quote Originally Posted by usiris View Post
    The result of my (and poc's) suggestion should be three datasets: the original data, survey one, and survey two.
    Oh. Yeah I like that way too, deleting the worker from the dataset. I guess it depends on data structure though for performance times.

    Quote Originally Posted by pacox View Post
    No, you use a random number to decide whether to add to the 10% or the 5% group. One participant is placed, then you pick another random number. Each time the loop runs, it randomly selects one of the groups.

    1 = 10%
    2 = 5%

    loop
    randomly pick 1 or 2

    if 1

    add participant to 1

    if 2

    add participant to 2

    loop
    It sounds really unnecessary. I mean, I guess if the first and second surveys are supposed to simultaneously be assigned then your way is the only way to make it truly random. I don't like the idea of running an RNG just to pick 1 or 2 though. Sounds like a performance hit.

  10. #10
    Quote Originally Posted by Blueobelisk View Post
    I wonder why you'd have to make this thread on your alt account.



    Oh. Yeah I like that way too, deleting the worker from the dataset. I guess it depends on data structure though for performance times.



    It sounds really unnecessary. I mean, I guess if the first and second surveys are supposed to simultaneously be assigned then your way is the only way to make it truly random. I don't like the idea of running an RNG just to pick 1 or 2 though. Sounds like a performance hit.
    Oh for sure. I said that it could be done better but I figured OP was doing this for some class with just a basic understanding of whatever language he is using. I left out some stuff but zetitup elaborated on what I was meaning to illustrate.


  11. #11
    Deleted
    Copy 1/10 of the workers from the original into an array and remove them from the source array.
    Copy 1/20 of the original number of workers into a separate array and remove them from the source array.

    Interact with each respective array to give them their surveys.

    Or use ArrayLists if that suits your fancy, to make iterating over them and changing values faster.
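
    One way to sketch this in Python, assuming the picks are meant to be random. random.sample draws without repeats, so sampling 15% once and splitting the result gives two groups that can't overlap. The worker IDs are placeholders:

```python
import random

workers = list(range(1000))          # placeholder worker IDs
n_first = len(workers) // 10         # 1/10 of the workers
n_second = len(workers) // 20        # 1/20 of the workers

# One draw of 15% without duplicates, then split it into the two groups
chosen = random.sample(workers, n_first + n_second)
first_survey = chosen[:n_first]
second_survey = chosen[n_first:]

# Everyone who was not picked stays in the source list
picked = set(chosen)
workers = [w for w in workers if w not in picked]
```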

    - - - Updated - - -

    Quote Originally Posted by zetitup View Post
    i hope you are doing this in an oop language, but i would create a class which contains bools of dataSet 1 and 2.

    then i would do this, however, this CAN be dangerous, so make a breakout also
    /snip/
    Not familiar with whatever language that is (I suspect it's some form of C# or C++), but is the chance truly 5% and 10% in your example?

    Mostly thinking of the n = random(0, 100), because if that includes both 0 and 100, you have 101 options.
    Last edited by mmoc411114546c; 2016-12-11 at 06:38 AM.

  12. #12
    Quote Originally Posted by PvPHeroLulz View Post
    Not familiar with whatever language that is (I suspect it's some form of C# or C++), but is the chance truly 5% and 10% in your example?
    That is called pseudo code, it applies to any language you prefer, it just describes how you should handle it

    Quote Originally Posted by PvPHeroLulz View Post
    Mostly thinking of the n = random(0, 100), because if that includes both 0 and 100, you have 101 options.
    n = random(0, 100)
    includes the 0, and the way random works here is that the upper bound is exclusive, so you get anything from the first number up to the second number - 1.
    So if you want it to actually be random between 0 and 100, you have to do
    n = random(0, 100 + 1)

    But in the case I wrote, n = random(0, 100), you will only get a result between 0 and 99, aka 100 options.
    Hope that is a good explanation.

    But to answer the question: yes, it should be 5% and 10%. The reason for < 5 in one case and >= 5 and < 15 in the other is to make sure the 10% check doesn't overlap the 0 - 4 results and that each result falls within the 5% or 10% chance. This is not the best way to calculate the percentage, but it is the easiest for this type of problem, where you only need 2 different results.
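
    For comparison, a small Python sketch using Python's own random module (not the pseudocode's random): randrange(0, 100) excludes the upper bound, while randint(0, 100) includes it. The variable names are only for illustration:

```python
import random

# randrange(0, 100) excludes the upper bound: possible results are 0..99
possible = range(0, 100)
five_percent = sum(1 for n in possible if n < 5)        # values 0..4
ten_percent = sum(1 for n in possible if 5 <= n < 15)   # values 5..14
print(five_percent / len(possible), ten_percent / len(possible))

# randint(0, 100) includes both ends, giving 101 possible results
draws_half_open = [random.randrange(0, 100) for _ in range(10_000)]
draws_inclusive = [random.randint(0, 100) for _ in range(10_000)]
```

    With 100 equally likely values, 5 of them fall under the first check and 10 under the second.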
    Last edited by zetitup; 2016-12-11 at 07:05 AM.

  13. #13
    Deleted
    Quote Originally Posted by zetitup View Post
    That is called pseudo code, it applies to any language you prefer, it just describes how you should handle it



    n = random(0, 100)
    includes the 0, and the way random works here is that the upper bound is exclusive, so you get anything from the first number up to the second number - 1.
    So if you want it to actually be random between 0 and 100, you have to do
    n = random(0, 100 + 1)

    But in the case I wrote, n = random(0, 100), you will only get a result between 0 and 99, aka 100 options.
    Hope that is a good explanation.
    I know what pseudocode is, I just wondered what kind of code structure it is, because I can't compare it to anything I know, and I know Java, C#, C++ and Python as far as loop structures go (my only two main areas are Java and Python).

    And I was mostly asking what the range was, because I wondered how that specific range would behave. I already know that Java has ways that actually include the upper bound as well, whilst some randoms just return a double such as 0.5 or whatnot, even when supplying a very big number.

    But yeah, I was mostly wondering because I thought it was a real language, lol :P

  14. #14
    Quote Originally Posted by Ysilla View Post
    Everyone seems to focus on getting exactly 5% and 10%, that's not what's being asked. Simple solution that avoids all collision issues:

    Code:
    foreach (worker in workers)
      i = random(100)
      if (i < 10)
        assign survey A to worker
      else if (i < 15)
        assign survey B to worker
    done.
    This is fine for the 10% section, because that says explicitly "approximately 10%". However, the 5% section does not say approximately; it says 5%. So going by the specification provided you will have to add a cap to the 5% section, to ensure it doesn't go over the limit. And more complex, you will need to add a check to see if it is going to end up being less than 5%, to ensure that you force cases towards the end of the batch into that group to get the number up to exactly 5%. This second check will also need to take into account how many are in the 10% group, to ensure you don't cause too much variance in the 10% group by forcing cases into the 5% group. But my first question back to the person providing this spec would be "what variance is allowable on the 10%" followed by "why is there no variance allowed in the second group?"

    That's what comes from doing coding for actuaries for a long time!
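
    For reference, the quoted per-worker draw could look like this in Python. This is only a sketch: `assign` and the "A"/"B" labels are made up here. The counts differ from run to run, which is exactly the variance in question:

```python
import random

def assign(workers):
    """Give each worker survey 'A' (about 10%), 'B' (about 5%) or None."""
    result = {}
    for worker in workers:
        i = random.randrange(100)      # 0..99
        if i < 10:
            result[worker] = "A"
        elif i < 15:
            result[worker] = "B"
        else:
            result[worker] = None      # no survey for this worker
    return result

assignment = assign(range(1000))
count_a = sum(1 for s in assignment.values() if s == "A")
count_b = sum(1 for s in assignment.values() if s == "B")
print(count_a, count_b)   # varies per run
```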
    When challenging a Kzin, a simple scream of rage is sufficient. You scream and you leap.
    Quote Originally Posted by George Carlin
    Think of how stupid the average person is, and realize half of them are stupider than that.
    Quote Originally Posted by Douglas Adams
    It is a well-known fact that those people who must want to rule people are, ipso facto, those least suited to do it... anyone who is capable of getting themselves made President should on no account be allowed to do the job.

  15. #15
    Quote Originally Posted by Ysilla View Post
    5% has to be approximate too, what if there are 95 workers, do you assign 4.75 workers to B?

    So there's no solution, OP needs to ask for a clearer subject.
    Why does 5% need to be approximate? Point me to where it says that in the specification.

    Coding isn't about making up requirements because you think that's what they should be. That is when you end up producing something that isn't what was wanted. Especially if you are basing that "requirement" on something else that may or may not be true (like the number of workers).

  16. #16
    Quote Originally Posted by Ysilla View Post
    That's fun, this is EXACTLY the issue I pointed out here. You can't solve the problem unless you assume some requirements that are not part of the specifications from OP's post: workers must be a multiple of 20 or "5% of workers" doesn't make sense if it has to be accurate (or we need to know if we assume workers is a multiple of 20, or how that 5% should be rounded if not). Since this is not clearly specified, he needs better specifications, which is exactly what I said.
    Which is fine, I agree with you that the specifications need clarification. But I don't agree with your statement that 5% "has to be approximate". Because you don't know that from the information given. It might be, it might not. We don't know. And understanding precisely what you do and do not know is a key part of being a programmer.

  17. #17
    I'm pretty sure the exercise is an assignment for school and not something business related :P
    Otherwise I could write an extremely well-written function with no errors, but I doubt that is the case. I'd bet my shaman that it is about showing how you think as a programmer and giving a working solution for the current problem, nothing else.

  18. #18
    Quote Originally Posted by zetitup View Post
    I'm pretty sure the exercise is an assignment for school and not something business related :P
    Otherwise I could write an extremely well-written function with no errors, but I doubt that is the case. I'd bet my shaman that it is about showing how you think as a programmer and giving a working solution for the current problem, nothing else.
    And as a couple of us have stated; if you are thinking like a programmer you would be saying "I can't code this, the specification is incomplete/inaccurate/contradictory". You can code something based on what they have asked for, but there is no guarantee it will end up being what they actually want.

    This feels to me like a coding exercise written by someone that isn't a coder, or at least isn't a very good one. I used to see this kind of thing a lot when I was in college back in the 1980s.

  19. #19
    Deleted
    Quote Originally Posted by zetitup View Post
    I hope you are doing this in an OOP language, but I would create a class that contains bools for dataset 1 and 2.

    Then I would do this. However, this CAN be dangerous, so add a breakout as well.

    Code:
    int counter = 0
    
    while counter < requiredNumberOfWorkers   // calculate this first
        for each worker
           if worker.has5percentSurvey or worker.has10percentSurvey
             continue   // he already has a survey, and nobody may get both
           else
             n = random(0, 100)
             if n < 5
                 send5percentSurvey()
                 worker.has5percentSurvey = true
                 counter++
             else if n >= 5 and n < 15
                 send10percentSurvey()
                 worker.has10percentSurvey = true
                 counter++
             end if
           end if
        end for
    end while
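    This approach might send more or fewer than 5% / 10% of the surveys.

    I'd first build a list of targets and then send them a survey instead:

    (Python)

    Code:
    import math
    import random
    
    workers = ["John", "Peter", "Kevin", "Jasmine", "Caterine", "Jhonny", "Electronium10000"]
    survey_1_percentrequired = 10   # first survey: 10%
    survey_2_percentrequired = 5    # second survey: 5%
    
    # Rounded down; with only 7 names both amounts come out as 0
    survey_1_amount = math.trunc(len(workers) * survey_1_percentrequired / 100)
    survey_2_amount = math.trunc(len(workers) * survey_2_percentrequired / 100)
    survey_1_targets = []
    survey_2_targets = []
    
    def Sendsurvey1(target):   # placeholder for the real sending code
        print("Survey 1 to", workers[target])
    def Sendsurvey2(target):   # placeholder for the real sending code
        print("Survey 2 to", workers[target])
    
    # Draw random positions until survey 1 has enough distinct targets
    while len(survey_1_targets) < survey_1_amount:
        pick = random.randrange(len(workers))
        if pick not in survey_1_targets:
            survey_1_targets.append(pick)
    
    # Same for survey 2, skipping anyone already picked for survey 1
    while len(survey_2_targets) < survey_2_amount:
        pick = random.randrange(len(workers))
        if pick not in survey_1_targets and pick not in survey_2_targets:
            survey_2_targets.append(pick)
    
    for target in survey_1_targets:
        Sendsurvey1(target)
    for target in survey_2_targets:
        Sendsurvey2(target)

    The target parameter passed to the Sendsurvey1/Sendsurvey2 functions is the position in the list; you can then access the workers data with that.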
    Last edited by mmoc00230c3bbe; 2016-12-11 at 10:57 AM.

  20. #20
    Quote Originally Posted by usiris View Post
    If you have a hardcoded number you need to hit, that's pretty easy. Otherwise, work out how many workers make up 10% and 5%.

    Generate random numbers between 1 and 1000, pick the matching item out of the dataset, and push it into a separate data structure. Repeat until the number of items in that data structure equals the number needed.
    Almost, but after picking one there are only 999 elements left, which makes it a bit messy. You could skip an element if it is already selected, or you could remove the element, but you normally don't have data structures where you can both remove elements in the middle and index efficiently.

    But a simpler solution is to follow C++'s random_shuffle: pick a number 1-1000 called 'j' and swap the first and the j-th element (if j is 1 don't do anything), then pick a number 2-1000 (called 'j' again) and swap the second and the j-th element, etc., until you have the 5% and 10% of the elements at the front.
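
    A rough Python sketch of that partial shuffle, using 0-based positions instead of 1-1000. `partial_shuffle` is a made-up name and the worker IDs are placeholders:

```python
import random

def partial_shuffle(items, k):
    """Shuffle only the first k positions of items, in place."""
    n = len(items)
    for i in range(k):
        j = random.randrange(i, n)               # pick from positions i..n-1
        items[i], items[j] = items[j], items[i]  # no-op when j == i
    return items

workers = list(range(1000))       # placeholder worker IDs
n_first = len(workers) // 10      # 10%
n_second = len(workers) // 20     # 5%
partial_shuffle(workers, n_first + n_second)

# The front of the list now holds the random picks
first_survey = workers[:n_first]
second_survey = workers[n_first:n_first + n_second]
```

    Nothing is removed from the list, so indexing stays cheap and only 150 swaps are needed for 1000 workers.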

    - - - Updated - - -

    Quote Originally Posted by Huehuecoyotl View Post
    And as a couple of us have stated; if you are thinking like a programmer you would be saying "I can't code this, the specification is incomplete/inaccurate/contradictory". You can code something based on what they have asked for, but there is no guarantee it will end up being what they actually want.
    That is often the case in real life. People ask for the wrong solution, specified incorrectly, in order to solve the wrong problem.
    Last edited by Forogil; 2016-12-11 at 11:02 AM.
