  1. #61
    Quote Originally Posted by Cracked View Post
    You have zero idea what you are talking about. Apparently you have never heard about function inlining, and you have no idea how to do things efficiently.

    here is just my raw take on doing an eps comparison, assuming your numbers are in eax and ebx (if you are working with them, they most likely are) and your eps is in edx, as it's a convenient place to store it in for many comparisons.
    sub eax, ebx
    mov ebx, eax
    neg eax
    cmovl eax, ebx
    cmp eax, edx

    so that's 5 simple instructions, and I have no doubt it could be done way better. As to your remark "almost-invisible bugs", no. If it results in a bug for you, you don't know what you are doing.

    - - - Updated - - -

    And yes, you can pipeline that with no stalls on modern x86 processors.
    My dear, inlining and pipelining are two completely different concepts. They are not even remotely related and frankly, inlining is irrelevant to this discussion. Moving on, you are assuming your values are stored in two registers; how did you load those values into registers? There are at least two additional load instructions there. Also, I have no access to the design documents of any mainstream CPU, so I don't know how their algorithms function, how they pipeline, etc. But as a general rule of thumb, the same operators in sequence are not pipelinable, and branches are strictly NOT pipelinable unless predicted and the assembly code is reconfigured. You have a (missing) branch instruction in your code (after CMP). Your code is incomplete.
    Last edited by Kuntantee; 2016-05-18 at 01:42 PM.

  2. #62
    Quote Originally Posted by Cracked View Post
    You have zero idea what you are talking about. Apparently you have never heard about function inlining, and you have no idea how to do things efficiently.

    here is just my raw take on doing an eps comparison, assuming your numbers are in eax and ebx (if you are working with them, they most likely are) and your eps is in edx, as it's a convenient place to store it in for many comparisons.
    sub eax, ebx
    mov ebx, eax
    neg eax
    cmovl eax, ebx
    cmp eax, edx

    so that's 5 simple instructions, and I have no doubt it could be done way better. As to your remark "almost-invisible bugs", no. If it results in a bug for you, you don't know what you are doing.

    - - - Updated - - -

    And yes, you can pipeline that with no stalls on modern x86 processors.
    You should really stop arguing.

  3. #63
    Quote Originally Posted by rda View Post
    You just write your computations in a way that tries to make sure errors don't snowball beyond control (limits on recursive math, etc) and live with it.
    They aren't really predictable, because you don't know the number stored in your variable.

  4. #64
    My dear, branching is not in any way related to the discussion we are having. Inlining was a remark on your function call for eps. Values that you are working with are gonna be stored in registers, that's a fact.

    In any case, even branches are ALWAYS pipelined, and the pipeline is partially cleared if the prediction turns out to be incorrect.

    All you wrote only confirms my suspicion that you have very superficial knowledge of how computers work. Therefore, I end my part in this discussion here.

  5. #65
    Quote Originally Posted by Kuntantee View Post
    I am not sure how you wouldn't say that when the implementations in C++ aren't precise and you can't use the equality operator for comparison. You need to introduce a class and overload the == operator (fixed-point number) or implement an equality method using an epsilon to compare. The overhead is a result of the extra function call in the hotspot. If you can't see this, I am afraid you aren't a good programmer.

    By the way, IEEE Standard for Floating-Point Arithmetic (IEEE 754) is a hardware implementation standard, not software.
    It is, but the availability of intrinsics, which are normally required to work w/ doubles or floats efficiently (and more or less painlessly), depends on the available hardware, SSE* and shit. The only overhead I can think of rn is moving data from SSE units to ALU registers. No one who thinks about performance would do anything involving the == operator. Epsilon comps are a poor man's solution.

    That's why work w/ floats depends on both software implementation and hardware you run your piece of software on. We have no fucking idea about either, we're speculating rn.
    Last edited by ls-; 2016-05-18 at 01:45 PM.

  6. #66
    Quote Originally Posted by lightspark View Post
    Epsilon comps are a poor man's solution.
    It is the only solution available to high-level programming.

  7. #67
    Quote Originally Posted by Kuntantee View Post
    My dear, function inlining and pipelining are two completely different concepts. They are not even remotely related and frankly, inlining is irrelevant to this discussion. Moving on, you are assuming your values are stored in two registers; how did you load those values into registers? There are at least two additional load instructions there. Also, I have no access to the design documents of any mainstream CPU, so I don't know how their algorithms function, how they pipeline, etc. But as a general rule of thumb, the same operators in sequence are not pipelinable, and branches are strictly NOT pipelinable unless predicted and the assembly code is reconfigured. You have a (missing) branch instruction in your code (after CMP).
    Look, he has a point in that comparing two doubles with a hard-set (or even relative) precision is not the end of the world and most of the instructions *would* execute in parallel. Yes, there would be branch prediction, the easily predictable branch in most cases is the one for "not equal" unless the profiles show otherwise. The register loads that you mention would have to be there even if the numbers were ints, they are free in that sense.

    There isn't much point grappling over this.

    You are right that comparing floating-point values for equality is tricky and you are right that in places where this should be done, this should be done with care and it would be more costly than comparing integer values. But it wouldn't have much of an effect on anything, because it is rare that you have to do this at all, you mostly compare for which is greater, not for equality.
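For reference, a minimal C++ sketch of the "hard-set or relative precision" comparison discussed above. The name `nearly_equal` and both default tolerances are hypothetical choices made for illustration; nothing in the thread prescribes them, and suitable tolerances depend on the value ranges involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the thread): returns true when a and b differ
// by at most abs_eps, or by at most rel_eps times the larger magnitude.
// Both default tolerances are illustrative assumptions.
inline bool nearly_equal(double a, double b,
                         double abs_eps = 1e-12, double rel_eps = 1e-9) {
    assert(abs_eps >= 0.0 && rel_eps >= 0.0);
    const double diff = std::fabs(a - b);
    if (diff <= abs_eps) {
        return true;  // absolute check covers values close to zero
    }
    // Relative check scales the tolerance with the operands' magnitude.
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= rel_eps * scale;
}
```

With these assumed tolerances, `nearly_equal(0.1 + 0.2, 0.3)` holds even though `0.1 + 0.2 == 0.3` does not. Ordering comparisons such as `a < b` need no helper of this kind.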

    - - - Updated - - -

    Quote Originally Posted by Kuntantee View Post
    They aren't really predictable, because you don't know the number stored in your variable.
    You know the range and that, together with operations you are performing (and the ranges of coefficients) gives you the range of errors. In practice, this (making sure errors don't snowball out of control) boils down to a few rules of thumb.
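A small C++ sketch of that idea, under stated assumptions: `accumulate_steps` and `accumulation_error_bound` are hypothetical names, and the bound is a deliberately loose first-order estimate for repeated addition of one value (roughly count times machine epsilon times the magnitude of the final sum), not a tight analysis.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: adds `step` to an accumulator `count` times.
// Every addition may round, so small errors can pile up.
inline double accumulate_steps(double step, int count) {
    assert(count >= 0);
    double total = 0.0;
    for (int i = 0; i < count; ++i) {
        total += step;
    }
    return total;
}

// Loose first-order error estimate (an assumption for illustration):
// count * machine_epsilon * |count * step|.
inline double accumulation_error_bound(double step, int count) {
    assert(count >= 0);
    const double eps = std::numeric_limits<double>::epsilon();
    return count * eps * std::fabs(count * step);
}
```

Adding 0.1 ten times this way generally does not land exactly on 1.0 with IEEE 754 doubles, but the result stays well inside the estimated error range, which is the point being made: known ranges and a known number of operations give a usable error range.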
    Last edited by rda; 2016-05-18 at 01:48 PM.

  8. #68
    This is slowly turning into an episode of "Silicon Valley".
    Atoms are liars, they make up everything!

  9. #69
    Quote Originally Posted by Cracked View Post
    My dear, branching is not in any way related to the discussion we are having. Inlining was a remark on your function call for eps. Values that you are working with are gonna be stored in registers, that's a fact.
    Fine, they will be inlined. It was a mistake to mention function calls.

    Quote Originally Posted by Cracked View Post
    In any case, even branches are ALWAYS pipelined, and the pipeline is partially cleared if the prediction turns out to be incorrect.
    Thanks for proving my point. Any code involving branches (your code, except you forgot to add the branch there for some odd reason) is not guaranteed to be pipelined, effectively making the use of a comparison function more expensive.

    You are just destroyed, my friend. Let me remind you of your original statement one more time.

    And I know everything is pumped into the pipeline and cleared once the prediction turns out to be incorrect. There is no point in calling a failed prediction "pipelining", because it isn't being pipelined. There are other methods, like parallel pipelining etc.

    Quote Originally Posted by Cracked View Post
    then you might go for \eps equality, which is still 3 fp instructions (which are pipelined and executed at the same time anyway).

    Quote Originally Posted by Cracked View Post
    All you wrote only confirms my suspicion that you have very superficial knowledge of how computers work. Therefore, I end my part in this discussion here.
    You wrote incomplete code, leaving out the most important part -- the branch instruction -- in a discussion regarding pipelining, and now you are telling me I only have superficial knowledge? Me spotting your stupid errors (intentional, or just due to being a terrible programmer) isn't enough for you to understand the level of sophisticated knowledge I have access to?
    Last edited by Kuntantee; 2016-05-18 at 01:58 PM.

  10. #70
    Quote Originally Posted by Kuntantee View Post
    It is the only solution available to high-level programming.
    Working w/ intrinsics isn't really assembly coding though, in this particular case it's much closer to high-level programming and you can use them in a high-level language...
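To illustrate, here is a hedged C++ sketch of an epsilon comparison written with SSE2 intrinsics from `<emmintrin.h>`. The function name `within_eps2` and the two-lane layout are assumptions made for this example; on targets where the compiler does not define `__SSE2__`, it falls back to plain scalar code.

```cpp
#include <cassert>
#include <cmath>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical helper: compares two pairs of doubles at once.
// Bit i of the result is set when |a[i] - b[i]| <= eps.
inline int within_eps2(const double a[2], const double b[2], double eps) {
    assert(eps >= 0.0);
#if defined(__SSE2__)
    const __m128d va = _mm_loadu_pd(a);
    const __m128d vb = _mm_loadu_pd(b);
    const __m128d diff = _mm_sub_pd(va, vb);
    // Clearing the sign bit of each lane yields the absolute value.
    const __m128d abs_diff = _mm_andnot_pd(_mm_set1_pd(-0.0), diff);
    const __m128d le = _mm_cmple_pd(abs_diff, _mm_set1_pd(eps));
    return _mm_movemask_pd(le);  // one result bit per lane
#else
    // Portable fallback when SSE2 is not available at compile time.
    int mask = 0;
    for (int i = 0; i < 2; ++i) {
        if (std::fabs(a[i] - b[i]) <= eps) {
            mask |= 1 << i;
        }
    }
    return mask;
#endif
}
```

The SSE2 path contains no conditional jump: the absolute value is a bit mask operation and the comparison produces a lane mask. Whether that is actually faster than scalar code depends on the hardware and the surrounding code, as the post above says.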
    Last edited by ls-; 2016-05-18 at 02:00 PM. Reason: clarified...

  11. #71
    But there is no branch. At the end of that code you have the result of the eps comparison in the sign flag. So I rest my case.

  12. #72
    Quote Originally Posted by Kuntantee View Post
    You wrote incomplete code, leaving out the most important part -- the branch instruction -- ...
    He used a conditional move, his code is complete. Multiple execution paths and prediction are obviously still there.

  13. #73
    Deleted
    shift+9 binds lul

  14. #74
    Quote Originally Posted by Cracked View Post
    But there is no branch. At the end of that code you have the result of the eps comparison in the sign flag. So I rest my case.
    The more I read your code, the more it fails. I just don't know how you can implement a fucking epsilon comparison without branches... Anyway, you used cmovl wrong: cmp should come before cmovl (the missing branch I was talking about) and it should compare the diff with epsilon.

    My dear, cmovl is a branching instruction.

    /facepalm

    - - - Updated - - -

    Quote Originally Posted by rda View Post
    He used a conditional move, his code is complete. Multiple execution paths and prediction are obviously still there.
    His code is plain wrong. The cmovl requires a preceding comparison and is a branching instruction. Do not put your faith in a random guy on the Internet just because they write something that looks like assembly.

    Look, I am not saying it will never be pipelined. It may be, but I know for a fact that these things are not that predictable. This case might be trivial, I don't know. I am not an expert assembly programmer, but what I do know is that branches are problematic and may mean overhead. I mean, I even claimed the overhead it introduces might be trivial.
    Last edited by Kuntantee; 2016-05-18 at 02:14 PM.

  15. #75
    Quote Originally Posted by Kuntantee View Post
    The more I read your code, the more it fails. I just don't know how you can implement a fucking epsilon comparison without branches... Anyway, you used cmovl wrong: cmp should come before cmovl and it should compare the diff with epsilon. You write some gibberish code and challenge me...
    I want this to end in a friendly way for some reason, so - he computed the difference and is comparing it to zero, which is equivalent to comparing the two numbers with each other (then he compares the absolute value of the difference to epsilon, and if that's higher, the result is that the numbers are not equal in the epsilon-sense).

    PS: Flags are set by neg.
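The same steps can be sketched in C++ for the integer case. `within_eps_int` is a hypothetical name; widening to 64 bits is an addition of this sketch to sidestep overflow in the subtraction (the five-instruction snippet works on 32-bit registers and does not address that), and a compiler may or may not choose a conditional move for the select.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the sub / neg / cmovl / cmp sequence:
// returns true when |a - b| < eps.
inline bool within_eps_int(std::int32_t a, std::int32_t b, std::int32_t eps) {
    assert(eps >= 0);
    // Widened so the subtraction cannot overflow (assumption of this sketch).
    const std::int64_t diff = static_cast<std::int64_t>(a) - b;  // sub
    const std::int64_t negated = -diff;                          // neg
    // Keep whichever of diff / -diff is non-negative: the absolute value.
    const std::int64_t abs_diff = (negated < 0) ? diff : negated;  // cmovl
    return abs_diff < eps;                                         // cmp
}
```

The select on `negated < 0` corresponds to the conditional move: the sign of the negated difference decides which copy is kept, so no separate comparison is needed before it. The final comparison yields a boolean; branching on it is up to the caller.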
    Last edited by rda; 2016-05-18 at 02:15 PM.

  16. #76
    Will Blizz get their heads out of their asses and just proceed with a fucking item squish... these numbers are getting retarded to say the least.

  17. #77
    Quote Originally Posted by BeerWolf View Post
    Will Blizz get their heads out of their asses and just proceed with a fucking item squish... these numbers are getting retarded to say the least.
    Nah, they got heads out of their asses and finally addressed source of the issue, instead of working around the problem.

  18. #78
    Quote Originally Posted by rda View Post
    I want this to end in a friendly way for some reason, so - he computed the difference and is comparing it to zero, which is equivalent to comparing the two numbers with each other (then he compares the absolute value of the difference to epsilon, and if that's higher, the result is that the numbers are not equal in the epsilon-sense).

    PS: Flags are set by neg.
    Neg doesn't set any flag. What is the default flag for cmovl?

  19. #79
    Quote Originally Posted by Zephostopkek View Post
    shift+9 binds lul
    Not everybody plays with standard mouse and keyboards:


    and others allow to use even high numbers without much trouble.

  20. #80
    Quote Originally Posted by Kuntantee View Post
    Neg doesn't set any flag. What is the default flag for cmovl?
    Neg sets AF, CF, OF, PF, SF, ZF, cmovl uses SF and OF (IIRC). He'd need different instructions for floating-point, but the idea is clear.

    Anyway, we are boring everyone, let's let the thread continue.
