
Thread: Data Mining: How It Works

  1. #16
    Retired Member
    Join Date
    10th June 2015
    Posts
    1,009
    Thanks
    2,129
    Thanked 3,244 Times in 922 Posts
    Quote Originally posted by sandy View Post
    :spinning: I'm lost !!
    My contemplation is how to make 10, 100 or maybe 10,000 computers work together in a "divide and conquer" type strategy, using free software, to solve a complex or difficult problem.

    A mainframe is a single powerful computer; the problem IBM solves is how to use all of that computational capacity in an interesting way. Many of the technological solutions for the "divide and conquer" strategy (LXC, virtualization, and so on) already exist as free software in Linux and are used by IBM in their offering.

    Clustering permits a single computer to become part of a coordinated whole, forming a larger and more powerful computer system. I'm examining free software solutions that can unite smaller computers so they may be used as one much more powerful computer.
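The "divide and conquer" idea above can be sketched in a few lines of Python with the standard multiprocessing module; the worker count and the workload (a sum of squares) are just placeholder illustrations, not anyone's actual cluster software:

```python
from multiprocessing import Pool

def partial_sum(bounds):
    """Each worker sums its own slice of the range, independently of the others."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

def parallel_sum_of_squares(n, workers=4):
    """Split [0, n) into equal chunks, farm them out, then combine the results."""
    step = n // workers
    chunks = [(i * step, (i + 1) * step if i < workers - 1 else n)
              for i in range(workers)]
    with Pool(workers) as pool:
        return sum(pool.map(partial_sum, chunks))

if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```

The same split/compute/combine pattern scales from four processes on one machine to thousands of nodes in a cluster; only the transport (a process pool here, a network there) changes.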

    What is scary to remember, besides all of this stuff being free, is a trend we know as Moore's Law: the observation, made in 1965 by Gordon Moore, a co-founder of Intel, that the number of transistors on a chip doubles roughly every two years (often quoted as every 18 months), and with it the computational capacity of a computer.

    And now, in 2015, we are talking about freely available software that can unite these units of computational capacity, which have been doubling in power every couple of years for the last 40 years, so that 10,000 of them can work on a single task.

    That task may be modeling nuclear events in the study of physics, fluid dynamics calculations for F1 racing, statistical analysis of social media trends for a market advantage, or anything else that by its nature can be divided into smaller independent pieces.

    That is the world we live in.
    Last edited by lcam88, 2nd September 2015 at 13:40.

  2. The Following 3 Users Say Thank You to lcam88 For This Useful Post:

    Aragorn (2nd September 2015), bsbray (2nd September 2015), sandy (3rd September 2015)

  3. #17
    Retired Member
    Join Date
    22nd September 2013
    Posts
    1,141
    Thanks
    15,854
    Thanked 7,406 Times in 1,137 Posts
    Well, I do think that is a wonderful concept and goal, lcam88. It would unite people and communities versus corporations, and it would help offset the all-prevailing surveillance and control grid and the ensuing transhumanist bent the world seems to be travelling toward.

    I just wish I had a better understanding, and with that more self-trust, to download Linux and replace Microsoft, but it is too scary for a basically self-taught, peck-around illiterate such as I.

    Here's the funny thing: I managed a training company (a pilot project funded by employment insurance in '97) that facilitated a gruelling course to become a Microsoft Systems Engineer, and I did not know how to turn on a computer. The instructors were great and gave me a clone computer with written instructions on emailing and searching the web, so that I would stop saying, "Oh, just print it out for me and skip email, as I'm a hard-copy lady." They caught on to my denial, sent me home with the aforementioned computer, told me to have fun, and said that should I screw things up, not to worry: just bring the clone in and they would get the students to fix it. Thus that is how I learned the little I know.

    The one thing I remember they used to say often is "DOS is the boss and Windows is the manager", and that they could fix most anything when they went to DOS. That gave me confidence, and soon I was sending and receiving email and surfing the net.

    Today, I would be lost without this means of contact, entertainment and education so I am ever grateful even for the little I know.
    Last edited by sandy, 3rd September 2015 at 02:47. Reason: grammar

  4. The Following 3 Users Say Thank You to sandy For This Useful Post:

    Aragorn (3rd September 2015), bsbray (3rd September 2015), lcam88 (3rd September 2015)

  5. #18
    Retired Member
    Join Date
    10th June 2015
    Posts
    1,009
    Thanks
    2,129
    Thanked 3,244 Times in 922 Posts
    Sandy,

    There are Linux distributions you can write to a pen drive or burn to a DVD, which let you try Linux out without interfering with Windows.

    Most of the software needed for clustering is already freely available, but some of it has usage limitations that make it practical only for very specific tasks... The main obstacle is actually finding the hardware.

    Aragorn,

    Perhaps there is a different way besides clustering in the traditional sense.

    http://www.mersenne.org/primes/

    The organization above requires large computational capacity to find large prime numbers of the form 2^p - 1 (Mersenne primes), where p is the number of bits in the binary representation and must itself be prime. Number 2 on the list, for example: 2^3 - 1 = 7, represented as 111 in binary notation.

    People interested in participating can download and install a small program that runs in spare CPU cycles. The program connects to a web service, receives an exponent p, and then verifies whether or not 2^p - 1 is prime.
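The GIMPS client itself uses heavily optimized FFT-based arithmetic, but the underlying primality check is the Lucas-Lehmer test, which can be sketched in a few lines of Python (my own illustration, not the actual GIMPS code):

```python
def is_mersenne_prime(p):
    """Lucas-Lehmer test: for an odd prime p, the number 2^p - 1 is prime
    if and only if s(p-2) == 0 (mod 2^p - 1), where s(0) = 4 and
    s(k) = s(k-1)^2 - 2."""
    m = (1 << p) - 1          # the Mersenne candidate 2^p - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m   # reduce each step so the numbers stay small
    return s == 0
```

For example, is_mersenne_prime(3) confirms that 7 is prime, while is_mersenne_prime(11) correctly rejects 2047 = 23 x 89, showing why a prime exponent is necessary but not sufficient.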

    Maybe a compute-cluster type of system could be defined in a similar way? Sort of like torrenting, but with a focus on sharing spare CPU cycles to do something?

  6. The Following 2 Users Say Thank You to lcam88 For This Useful Post:

    Aragorn (3rd September 2015), sandy (3rd September 2015)

  7. #19
    Administrator Aragorn's Avatar
    Join Date
    17th March 2015
    Location
    Middle-Earth
    Posts
    20,303
    Thanks
    88,694
    Thanked 81,135 Times in 20,318 Posts
    Quote Originally posted by lcam88 View Post
    Aragorn,

    Perhaps there is a different way besides clustering in the traditional sense.

    http://www.mersenne.org/primes/

    The organization above requires large computational capacity to find large prime numbers of the form 2^p - 1 (Mersenne primes), where p is the number of bits in the binary representation and must itself be prime. Number 2 on the list, for example: 2^3 - 1 = 7, represented as 111 in binary notation.

    People interested in participating can download and install a small program that runs in spare CPU cycles. The program connects to a web service, receives an exponent p, and then verifies whether or not 2^p - 1 is prime.

    Maybe a compute-cluster type of system could be defined in a similar way? Sort of like torrenting, but with a focus on sharing spare CPU cycles to do something?
    Actually, nVidia's CUDA technology already provides a small-scale type of clustering by using the GPU on the graphics adapter card for general-purpose floating point operations. And in addition to that, a team of Belgian university students built a supercomputer a few years ago, housed in a single full-tower chassis but with multiple graphics adapter cards, each of which was used as an independent node in the cluster.

    One thing that's extremely suitable for this, is the ARM architecture, and particularly something like the Raspberry Pi. They're small, they're inexpensive, and they're easy to come by. There are videos on YouTube of guys who've built such a supercomputer, housed in a normal desktop tower chassis. Of course, ARM is not x86, so you'd need a different binary installation, or you'd have to build everything from scratch using Gentoo or LFS.

    You know, IBM, Toshiba and Sony developed an ideal clustering processor somewhere in the past decade. It was called the Cell processor, and it was comprised of a complete IBM POWER-based core in conjunction with 8 additional floating point cores (the SPEs), each of those 8 cores having its own 256 KiB of local store memory on the die. However, for some reason, the design never really got off the ground, except in Sony's PlayStation 3.
    = DEATH BEFORE DISHONOR =

  8. The Following User Says Thank You to Aragorn For This Useful Post:

    lcam88 (16th October 2015)

  9. #20
    Retired Member
    Join Date
    10th June 2015
    Posts
    1,009
    Thanks
    2,129
    Thanked 3,244 Times in 922 Posts
    If I remember correctly...

    The shader units in GPUs are not quite as precise as the FPU on a CPU, but they are sufficient for shading pixels. In that way they are very specialized.

    And in general, GPUs operate with a large amount of parallelism. They perform many similar, largely independent operations at the same time, such as pixel shading, but that specialization of task makes them rather poor at the more general tasks a CPU typically performs. That just means the resulting computing capacity has a narrower scope of application.

    I did read about the Cell processor; Intel also developed something similar. The PS3 chip actually packs eight of those floating point cores on the die (two of them unavailable to applications), with a ring-style bus to move data from one core to the next... The idea was to create a type of processing pipeline where the output of one core serves as input to one or more other cores.

    The specialization of this type of architecture detracts from general use, where workloads don't conform well to the type of problems that inspired the initial design. In a way, that is the same problem the GPU-based systems have.

    System-on-chip design is actually very interesting, and ARM is an interesting architecture / instruction set. x86 uses some of the strategies ARM does, but to a lesser extent; both use microcode (hardware instructions that implement a "native" machine instruction). ARM has logic blocks that are more general-purpose, which are then used to implement these micro-ops, while x86 has logic blocks that are more specialized. That view is the basis of the idea that ARM would be more efficient: less chip area needs to be powered on but idle than in the x86 counterpart.

    In reality, other aspects of chip design also weigh into efficiency: production capabilities, transistor size, and so on.

  10. The Following User Says Thank You to lcam88 For This Useful Post:

    Aragorn (3rd September 2015)

  11. #21
    Retired Member
    Join Date
    10th June 2015
    Posts
    1,009
    Thanks
    2,129
    Thanked 3,244 Times in 922 Posts
    An interesting thread on Slashdot about how the NSA breaks standard asymmetric encryption.

    http://it.slashdot.org/story/15/10/1...so-much-crypto

    EDIT

    Perhaps this is actually for the other thread on computer security. Pardon me for the error.

  12. The Following User Says Thank You to lcam88 For This Useful Post:

    Aragorn (16th October 2015)

  13. #22
    Retired Member United States
    Join Date
    2nd December 2015
    Location
    American Southwest (currently)
    Posts
    2,602
    Thanks
    12,814
    Thanked 13,156 Times in 2,620 Posts
    Quote Originally posted by bsbray View Post
    Hmmm...

    NSA
    MI5/6
    Mossad

    Am I getting close? :P

    I'd imagine the NSA probably already has something equivalent or even more sophisticated. They don't seem to have to wait for public technology, and they're already tapping everything directly from ISP's, and trading foreign governments for whatever else they can't get.
    One does not have to be a large organization to apply machine learning techniques effectively. We do "big data" modeling on essentially desktop computers you can buy in a computer store; $2-3K sets you up for most problems. Now, certain problems (like weather forecasting) are very large, but most "big data" tasks can be solved "at home" by partitioning the data through proper sampling. BUT: you must have the data in the first place. I'm "told" that the NSA excels at this.
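That "partition by sampling" idea can be sketched in plain Python; the statistic (a mean) and the sample sizes here are arbitrary illustrations, not anyone's actual workflow:

```python
import random

def estimate_mean(data, n_samples=50, sample_size=1000, seed=0):
    """Approximate the mean of a large data set by averaging the means of
    many small random samples -- one way to fit a 'big data' problem onto
    a desktop machine without loading everything at once."""
    rng = random.Random(seed)   # fixed seed keeps the sketch reproducible
    estimates = []
    for _ in range(n_samples):
        sample = rng.sample(data, sample_size)
        estimates.append(sum(sample) / sample_size)
    return sum(estimates) / n_samples
```

The trade-off is accuracy for memory and time: each sample fits comfortably in RAM, and averaging many sample estimates shrinks the error, but the full data set still has to exist somewhere to be sampled from.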

  14. The Following User Says Thank You to Dumpster Diver For This Useful Post:

    Aragorn (3rd December 2015)
