Thread: SQL Query Help

  1. #1

    SQL Query Help

    Hey guys. I have a question regarding advanced SQL querying. I saw a question like this being asked in a job interview for some company, and I myself have no idea how to solve it.

    You're given two tables. The second table is a list of Divisions within a company. Each division has a name and a unique ID. The first table is a list of employees. Each employee has an ID (unique), name, gender, and a Division ID that's associated (is a foreign key) with the Divisions table.

    Given these two tables, create a query that lists each division name along with the number of employees in that division. (In descending order. Any times in number of employees is sorted alphabetically.)

    Here's a table I drew as an example:



    And here's a sample output for that table:

    Code:
    Accounting 2
    HR 2
    Finances 1
    IT 0
    PR 0
    Sales 0

    -----

    It's easy to find basic query help, but I don't know about these advanced queries, so if anyone has any resources (books, tutorials, websites) for learning this advanced level SQL I'd be grateful if they posted it, as well.
    Last edited by Blueobelisk; 2017-11-06 at 04:14 AM.

  2. #2
    Hi there,

    I thought I'd give you an example of how to solve this using Oracle.

    First some test data:

    create table employee (
    id number,
    name varchar2(20),
    gender varchar2(1),
    division_id number);

    create table divisions (
    divisionid number,
    name varchar2(20));

    insert into employee values (0,'Lily','F',4);
    insert into employee values (1,'Dave','M',2);
    insert into employee values (2,'Beth','F',4);
    insert into employee values (3,'Bill','M',1);
    insert into employee values (4,'Carl','M',1);

    insert into divisions values (1,'HR');
    insert into divisions values (2,'Finances');
    insert into divisions values (3,'IT');
    insert into divisions values (4,'Accounting');
    insert into divisions values (5,'Sales');
    insert into divisions values (6,'PR');

    And then the query:

    select b.name, nvl(amt,0) emps from ( -- the two columns you want to show, with an alias on the count column
    select count(1) amt, divisions.name -- the two columns you want to show
    from divisions, employee -- the tables included in the query
    where employee.division_id = divisions.divisionid -- joining the two tables so we know which employees are connected to the divisions
    group by divisions.name -- need to group on something when doing a count
    ) a, divisions b -- putting an alias on the subquery, and joining the results from that with divisions again
    where a.name(+) = b.name -- left outer join between the subquery and the divisions table to get all divisions, regardless of count
    order by emps desc, -- order by amount of employees (emps) descending order
    b.name asc; -- and names in ascending order, to get the list in your desired way

    The query does grow a bit once you require to show empty divisions, but I guess it prevents stupid questions like (where are the other divisions?) etc.


    Hope this helps

  3. #3
    What flavor of SQL? There's quite a few of them...

    Though these three things are what you need to do at the very least (You might need to do a right or full join depending on how you setup the data).

    https://www.w3schools.com/sql/sql_count_avg_sum.asp
    https://www.w3schools.com/sql/sql_join_left.asp
    https://www.w3schools.com/sql/sql_groupby.asp
    Last edited by Woffie; 2017-11-06 at 06:21 AM.
    Part of the multi legendary crowd. Resto Drood from Silvermoon-EU Nesandur

  4. #4
    Elemental Lord callipygoustp's Avatar
    7+ Year Old Account
    Join Date
    Jun 2015
    Location
    Buffalo, NY
    Posts
    8,667
    Joins are considered advanced? I don't think so.

    Regardless, w3schools is a solid free option for learning basics:
    https://www.w3schools.com/sql/

  5. #5
    Your table design is flawed.

    while you could use a second table for your divisions. its just not necessary.
    its a 1 column table. there is no information to be saved when putting it inside a second table instead of directly putting that division info in the employee table.

    Your table Division is actually just an Enum.
    1= HR
    2 = Finance
    3 = IT
    thats an enum

    there is no need to save the string representation of the enum inside the database.
    Just have it in your code.
    One column to express the division is enough - you dont need number + string
    In your code when you read out the employee you can easily replace the number of the enum to its string representation for output.

    If you do it in one table your query also gets significantly easier.

    select distinct count(divisionId) from employees.
    The rest you do inside your code.


    Why is my solution better?

    Less storage = less costs
    Less complex queries

    Thats of course only true if you build production code. if your task is to do it exactly like you described because some stupid university professor thought that bad examples are good tasks then .. sorry my help is worthless^^
    Last edited by Jalatiphra; 2017-11-06 at 06:35 AM.

  6. #6
    Quote Originally Posted by Rabatami View Post
    And then the query:

    select b.name, nvl(amt,0) emps from ( -- the two columns you want to show, with an alias on the count column
    select count(1) amt, divisions.name -- the two columns you want to show
    from divisions, employee -- the tables included in the query
    where employee.division_id = divisions.divisionid -- joining the two tables so we know which employees are connected to the divisions
    group by divisions.name -- need to group on something when doing a count
    ) a, divisions b -- putting an alias on the subquery, and joining the results from that with divisions again
    where a.name(+) = b.name -- left outer join between the subquery and the divisions table to get all divisions, regardless of count
    order by emps desc, -- order by amount of employees (emps) descending order
    b.name asc; -- and names in ascending order, to get the list in your desired way

    The query does grow a bit once you require to show empty divisions, but I guess it prevents stupid questions like (where are the other divisions?) etc.


    Hope this helps
    This query seems strangely complicated. A simpler version, by the SQL standard (should work both for Oracle and other engines):

    Select d.name, count(e.id) employee_count from divisions d left join employee e on e.division_id = d.divisionid
    group by d.divisionid, d.name order by employee_count desc;

    This has been tested in MySQL. Should work in Oracle too. The "d.name" in order clause is there specifically for Oracle (as it doesn't allow columns in select clause which don't appear in group by) and can be omitted in MySQL. Count must be done with (e.id), and not (1) or (*) - that way it doesn't count empty rows rfom employee table, grabbed by outer join.

    - - - Updated - - -

    Quote Originally Posted by Jalatiphra View Post
    Your table design is flawed.

    while you could use a second table for your divisions. its just not necessary.
    its a 1 column table. there is no information to be saved when putting it inside a second table instead of directly putting that division info in the employee table.
    No point arguing it, that was a test assignment he was given. The divisions table could contain a lot of columns which are not important for the assignment.
    I have enough of EA ruining great franchises and studios, forcing DRM and Origin on their games, releasing incomplete games only to sell day-1 DLCs or spill dozens of DLCs, and then saying it, and microtransactions, is what players want, stopping players from giving EA games poor reviews, as well as deflecting complaints with cheap PR tricks.

    I'm not going to buy any game by EA as long as they continue those practices.

  7. #7
    Quote Originally Posted by procne View Post
    This query seems strangely complicated. A simpler version, by the SQL standard (should work both for Oracle and other engines):

    Select d.name, count(e.id) employee_count from divisions d left join employee e on e.division_id = d.divisionid
    group by d.divisionid, d.name order by employee_count desc;

    This has been tested in MySQL. Should work in Oracle too. The "d.name" in order clause is there specifically for Oracle (as it doesn't allow columns in select clause which don't appear in group by) and can be omitted in MySQL. Count must be done with (e.id), and not (1) or (*) - that way it doesn't count empty rows rfom employee table, grabbed by outer join.

    - - - Updated - - -



    No point arguing it, that was a test assignment he was given. The divisions table could contain a lot of columns which are not important for the assignment.
    You are absolutely right, and it is a much better query to solve the same problem. I think that in my 7 years of working with Oracle I never had to show zero counts, so I can happily admin I learned something new today

  8. #8
    Quote Originally Posted by Rabatami View Post
    Hope this helps
    Woah. I'll have to read it later.

    Quote Originally Posted by Woffie View Post
    What flavor of SQL? There's quite a few of them...
    Oh uhhhhhhhhhh. UHM. To be honest I don't remember the choices. I'm pretty sure it was MySQL and three others.

    Quote Originally Posted by Jalatiphra View Post
    Your table design is flawed.
    Quote Originally Posted by procne View Post
    No point arguing it, that was a test assignment he was given. The divisions table could contain a lot of columns which are not important for the assignment.
    Yeah, it wasn't my design. A company has a hiring process of giving online coding tests to screen candidates. They asked 3 coding questions (any language, fast algorithm necessary) + 1 DB question in an hour. The post I made is basically the same question but worded a bit differently and with different data that I made up.

    - - - Updated - - -

    Quote Originally Posted by Woffie View Post
    Though these three things are what you need to do at the very least (You might need to do a right or full join depending on how you setup the data).

    https://www.w3schools.com/sql/sql_count_avg_sum.asp
    https://www.w3schools.com/sql/sql_join_left.asp
    https://www.w3schools.com/sql/sql_groupby.asp
    Quote Originally Posted by callipygoustp View Post
    Joins are considered advanced? I don't think so.

    Regardless, w3schools is a solid free option for learning basics:
    https://www.w3schools.com/sql/
    Ah thanks for the link. I read through a few pages last night and will continue reading more today.

    - - - Updated - - -

    Quote Originally Posted by procne View Post
    Code:
    Select d.name, count(e.id) employee_count from divisions d left join employee e on e.division_id = d.divisionid
    group by d.divisionid, d.name order by employee_count desc;
    :O I didn't test it yet but thanks!

  9. #9
    Quote Originally Posted by Jalatiphra View Post
    Your table design is flawed.

    while you could use a second table for your divisions. its just not necessary.
    its a 1 column table. there is no information to be saved when putting it inside a second table instead of directly putting that division info in the employee table.

    Your table Division is actually just an Enum.
    1= HR
    2 = Finance
    3 = IT
    thats an enum

    there is no need to save the string representation of the enum inside the database.
    Just have it in your code.
    One column to express the division is enough - you dont need number + string
    In your code when you read out the employee you can easily replace the number of the enum to its string representation for output.

    If you do it in one table your query also gets significantly easier.

    select distinct count(divisionId) from employees.
    The rest you do inside your code.


    Why is my solution better?

    Less storage = less costs
    Less complex queries

    Thats of course only true if you build production code. if your task is to do it exactly like you described because some stupid university professor thought that bad examples are good tasks then .. sorry my help is worthless^^
    Well, sometimes it's easier to make changes and such when you have the divisions in its own table. It would also be easier to implement other tables that might need to use only the divisions names/IDs.

  10. #10
    Quote Originally Posted by Jalatiphra View Post
    Your table design is flawed.

    while you could use a second table for your divisions. its just not necessary.
    its a 1 column table. there is no information to be saved when putting it inside a second table instead of directly putting that division info in the employee table.

    Your table Division is actually just an Enum.
    1= HR
    2 = Finance
    3 = IT
    thats an enum

    there is no need to save the string representation of the enum inside the database.
    Just have it in your code.
    One column to express the division is enough - you dont need number + string
    In your code when you read out the employee you can easily replace the number of the enum to its string representation for output.

    If you do it in one table your query also gets significantly easier.

    select distinct count(divisionId) from employees.
    The rest you do inside your code.


    Why is my solution better?

    Less storage = less costs
    Less complex queries

    Thats of course only true if you build production code. if your task is to do it exactly like you described because some stupid university professor thought that bad examples are good tasks then .. sorry my help is worthless^^
    Honestly,,, Yes, you could skip the numbering of the departments... However, you need something to act as a KEY. Its part of the Normalization of data. Also, your approach is flawed. What if someone adds in a new department, without the ID's explicitly listed, a lot of people suddenly shifted their departments.

    Its a clever trick, but its not a robust one. And what would happen if for whatever reason your old code doesn't work anymore and you've lost the sourcecode? Sides, without the ID to link the two tables together, you can't do joins from within the database. Sides, what if the task is to get All information available about Dave? Your solution would require two different requests, plus calculations on the clientside... Vs a simple Join + Select "Stuff" from Table where name = 'Dave' .

    Especially if there's multiple programs accessing the database, some of them where you can't change how they process the data.
    Part of the multi legendary crowd. Resto Drood from Silvermoon-EU Nesandur

  11. #11
    Quote Originally Posted by Synthaxx View Post
    Just wait until you need to do subqueries or nested subqueries, they're "fun"
    Joins, Outer Applies and Unions oh my!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •