Thursday 31 August 2017

angularjs - Best practice looping through a JavaScript object




I have the following JavaScript object, and I need to apply parseFloat to every numeric field (so that ngTable sorts correctly).



I'm having a tough time looping through the Object to do this. I've tried a nested angular.forEach, but I have scoping issues (inner loops don't see outer variables).



What's the best manner to approach this?




The Object names (i.e: Person and PersonDetails) are dynamic. :/



My object:



{
    "data": [
        {
            "Person": {
                "id": "1",
                "age": "23",
                "days": "5",
                "first_name": "Joe",
                "last_name": "Smith"
            },
            "PersonDetails": {
                "id": "4",
                "name": "Cousin",
                "oldest": "2"
            }
        },
        {
            "Person": {
                "id": "2",
                "age": "18",
                "days": "3",
                "first_name": "John",
                "last_name": "Doe"
            },
            "PersonDetails": {
                "id": "4",
                "name": "Second Cousin",
                "oldest": "3"
            }
        }
        ...
        ...
    ]
};

Answer




You can do a test like this:



function representsNumber(str) {
    return str === (+str).toString();
}

// E.g. usage
representsNumber('a');    // false
representsNumber([]);     // false
representsNumber(1);      // false (it IS a number)

representsNumber('1.5');  // true
representsNumber('-5.1'); // true
representsNumber('NaN');  // true


And recurse over all your nodes. Overkill example:



function seeker(o, test, _true, _false) {
    _true || (_true = function (e) {return e;});
    _false || (_false = function (e) {return e;});

    function recursor(o) {
        var k;
        if (o instanceof Array)
            for (k = 0; k < o.length; ++k) // Iterate over an array
                if (typeof o[k] !== 'object')
                    o[k] = test(o[k]) ? _true(o[k]) : _false(o[k]);
                else
                    recursor(o[k]);
        else
            for (k in o) // Iterate over an object
                if (typeof o[k] !== 'object')
                    o[k] = test(o[k]) ? _true(o[k]) : _false(o[k]);
                else
                    recursor(o[k]);
    }
    if (typeof o === "object")
        return recursor(o), o;
    else
        return test(o) ? _true(o) : _false(o); // Not an object, just transform
}


// Sample usage
seeker({foo: [{bar: "20"}]}, representsNumber, parseFloat);
// {foo: [{bar: 20}]}

html - PHP - badly encoded Turkish characters in MySQL database

Answer


I am working on a Turkish website that has stored many malformed Turkish characters in a MySQL database, such as:



- ş as þ
- ı as ý
- ğ as ð
- Ý as İ



I cannot change the data in the database, because the database is updated daily and the new data will contain the malformed characters again. So my idea is to fix the data in PHP instead of changing the data in the database. I have tried the steps from these posts:



Turkish characters are not displayed correctly



Fix Turkish Charset Issue Html / PHP (iconv?)



PHP Turkish Language displaying issue



PHP MYSQL encoding issue ( Turkish Characters )




I am using the PHP-MySQLi-Database-Class available on GitHub with utf8 as charset.



I have even tried to replace the malformed characters with str_replace, like:



$newString = str_replace ( chr ( 253 ), "ı", $newString );


My question is: how can I solve the issue without changing the characters in the database? Are there any best practices? Is simply replacing the characters a good option?



EDIT:

solved it by using




javascript - async.each not iterating when using promises

I am trying to run an asynchronous loop async.each over an array of objects.
On each object in the array, I am trying to run two functions sequentially (using promises). The problem is that async.each only runs for the first keyword.



In the following code, getKeywords loads some keywords from a file, then returns an array of keyword objects. Each keyword object is put into searchKeyword that makes a search. The search result is then put into a database using InsertSearchResults.



In my mind, each keyword should be processed in parallel, with the search and insert functions chained for each keyword.




getKeywords(keys).then(function(keywords) {
    async.each(keywords, function(keywordObject, callback) {
        searchKeyword(keywordObject).then(function(searchResults) {
            return insertSearchResults(searchResults, db, collections);
        }).then(function(result) {
            console.log("here");
            callback();
        })
    })
})

c# - Check if datagridview cell is null or empty

I have to change the background color of the cells when their value is null or empty. This is the code I have written, similar to other examples here:




for (int rowIndex = 0; rowIndex < dataGridView1.RowCount; rowIndex++)
{
    string conte = dataGridView1.Rows[rowIndex].Cells[7].Value.ToString();
    if (string.IsNullOrEmpty(conte))
    {
        // dataGridView1.Rows[rowIndex].Cells[7].Style.BackColor = Color.Orange;
    }
    else
    {
        dataGridView1.Rows[rowIndex].Cells[7].Style.BackColor = Color.Orange;
    }
}


The dataset is complete and the DataGridView is shown populated, but then I get the error shown in the attached screenshot.



How can I fix this? Is there another way to write the code?

Why do I get garbage values when print arrays in Java?

It isn't garbage. It is the output of the default toString implementation for this class.


The easiest way to get the elements printed in a readable format is:


String printed = Arrays.toString(x);

Documentation for Arrays.toString(int[])


The same approach can be used to print arrays of primitives and of Objects.

Is it common to see directors acting in their own films?

There are a few movies I can note where their directors also played a role in their films. Orson Welles tended to star in many of his films, but that was during the 1940s. Roman Polanski played the role of the man who cut Jack Nicholson's nose in Chinatown, but Chinatown was released in 1974. The most recent film I can think of where this happened was when Quentin Tarantino played Jimmy in Pulp Fiction, which was released in 1994.


I can't really think of any recent movies where directors did this. Is it common to see directors acting in their own films these days?


Answer


It is a very common occurrence. To quote recent instances, Jon Favreau played the chauffeur in both Iron Man and Iron Man 2. Quentin Tarantino acted in Grindhouse, From Dusk Till Dawn, and several of his other films. Clint Eastwood appeared in Gran Torino and Million Dollar Baby. I could cite even more if I could recollect them year by year.


This Wikipedia article could be of some help.


bash - How to pipe stderr, and not stdout?




I have a program that writes information to stdout and stderr, and I need to grep through what's coming to stderr, while disregarding stdout.



I can of course do it in 2 steps:



command > /dev/null 2> temp.file
grep 'something' temp.file


but I would prefer to be able to do this without temp files. Are there any smart piping tricks?


Answer




First redirect stderr to stdout — the pipe; then redirect stdout to /dev/null (without changing where stderr is going):



command 2>&1 >/dev/null | grep 'something'


For the details of I/O redirection in all its variety, see the chapter on Redirections in the Bash reference manual.



Note that the sequence of I/O redirections is interpreted left-to-right, but pipes are set up before the I/O redirections are interpreted. File descriptors such as 1 and 2 are references to open file descriptions. The operation 2>&1 makes file descriptor 2 aka stderr refer to the same open file description as file descriptor 1 aka stdout is currently referring to (see dup2() and open()). The operation >/dev/null then changes file descriptor 1 so that it refers to an open file description for /dev/null, but that doesn't change the fact that file descriptor 2 refers to the open file description which file descriptor 1 was originally pointing to — namely, the pipe.


bash - How to make the hardware beep sound in Mac OS X 10.6



I just want Mac OS X 10.6 to make a hardware beep sound like openSUSE and other distributions do. I tried the following approaches:



Terminal -> beep = -bash: beep: command not found



Terminal -> say beep = voice speaks out beep (Not a Hardware beep but awesome ;) )



applescript -> beep = Macintosh bell (I want a Hardware beep!)



Does anybody know how to make the hardware beep from /bin/bash or AppleScript?


Answer



There is no "hardware beep" in macOS.



The functionality you're thinking of is an artifact of very old (pre-1990s) IBM PC-compatible hardware. Before most computers had sound cards, most machines had a small speaker or piezo buzzer connected to one of the channels of a timer chip. This could be used to generate simple tones or beeps. Even after many computers integrated sound cards, it remained common for quite some time for computers to route this output to a separate internal speaker. More recently, many computers, especially laptops, have integrated this functionality into the onboard sound card.



(If you're curious about the technical details of how the PC speaker interface worked, there are more details here.)



This hardware has never existed in Apple computers. The only audio output available is through the sound card, and the only system beep in macOS is the user's alert sound.


python - How do I pass a variable by reference?



The Python documentation seems unclear about whether parameters are passed by reference or value, and the following code produces the unchanged value 'Original'



class PassByReference:
    def __init__(self):
        self.variable = 'Original'
        self.change(self.variable)
        print(self.variable)

    def change(self, var):
        var = 'Changed'


Is there something I can do to pass the variable by actual reference?


Answer



Arguments are passed by assignment. The rationale behind this is twofold:




  1. the parameter passed in is actually a reference to an object (but the reference is passed by value)


  2. some data types are mutable, but others aren't



So:




  • If you pass a mutable object into a method, the method gets a reference to that same object and you can mutate it to your heart's delight, but if you rebind the reference in the method, the outer scope will know nothing about it, and after you're done, the outer reference will still point at the original object.


  • If you pass an immutable object to a method, you still can't rebind the outer reference, and you can't even mutate the object.





To make it even more clear, let's have some examples.



List - a mutable type



Let's try to modify the list that was passed to a method:



def try_to_change_list_contents(the_list):
    print('got', the_list)
    the_list.append('four')
    print('changed to', the_list)


outer_list = ['one', 'two', 'three']

print('before, outer_list =', outer_list)
try_to_change_list_contents(outer_list)
print('after, outer_list =', outer_list)


Output:




before, outer_list = ['one', 'two', 'three']
got ['one', 'two', 'three']
changed to ['one', 'two', 'three', 'four']
after, outer_list = ['one', 'two', 'three', 'four']


Since the parameter passed in is a reference to outer_list, not a copy of it, we can use the mutating list methods to change it and have the changes reflected in the outer scope.



Now let's see what happens when we try to change the reference that was passed in as a parameter:




def try_to_change_list_reference(the_list):
    print('got', the_list)
    the_list = ['and', 'we', 'can', 'not', 'lie']
    print('set to', the_list)


outer_list = ['we', 'like', 'proper', 'English']

print('before, outer_list =', outer_list)
try_to_change_list_reference(outer_list)
print('after, outer_list =', outer_list)



Output:



before, outer_list = ['we', 'like', 'proper', 'English']
got ['we', 'like', 'proper', 'English']
set to ['and', 'we', 'can', 'not', 'lie']
after, outer_list = ['we', 'like', 'proper', 'English']



Since the the_list parameter was passed by value, assigning a new list to it had no effect that the code outside the method could see. The the_list was a copy of the outer_list reference, and we had the_list point to a new list, but there was no way to change where outer_list pointed.



String - an immutable type



It's immutable, so there's nothing we can do to change the contents of the string



Now, let's try to change the reference



def try_to_change_string_reference(the_string):
    print('got', the_string)
    the_string = 'In a kingdom by the sea'
    print('set to', the_string)


outer_string = 'It was many and many a year ago'

print('before, outer_string =', outer_string)
try_to_change_string_reference(outer_string)
print('after, outer_string =', outer_string)



Output:



before, outer_string = It was many and many a year ago
got It was many and many a year ago
set to In a kingdom by the sea
after, outer_string = It was many and many a year ago


Again, since the the_string parameter was passed by value, assigning a new string to it had no effect that the code outside the method could see. The the_string was a copy of the outer_string reference, and we had the_string point to a new string, but there was no way to change where outer_string pointed.




I hope this clears things up a little.



EDIT: It's been noted that this doesn't answer the question that @David originally asked, "Is there something I can do to pass the variable by actual reference?". Let's work on that.



How do we get around this?



As @Andrea's answer shows, you could return the new value. This doesn't change the way things are passed in, but does let you get the information you want back out:



def return_a_whole_new_string(the_string):
    new_string = something_to_do_with_the_old_string(the_string)
    return new_string

# then you could call it like
my_string = return_a_whole_new_string(my_string)


If you really wanted to avoid using a return value, you could create a class to hold your value and pass it into the function or use an existing class, like a list:



def use_a_wrapper_to_simulate_pass_by_reference(stuff_to_change):
    new_string = something_to_do_with_the_old_string(stuff_to_change[0])
    stuff_to_change[0] = new_string

# then you could call it like
wrapper = [my_string]
use_a_wrapper_to_simulate_pass_by_reference(wrapper)

do_something_with(wrapper[0])


Although this seems a little cumbersome.



c++ - How to use a function template that is declared in a .hpp file and implemented in a .cpp file

I want to declare a function template in a header file, implement it in a .cpp file, and call the function from main, but without using a class.
How do I do that?




Thanks



temp.hpp :



#ifndef TEMP_HPP
#define TEMP_HPP
template <typename T>
void show(T a);
#endif /* TEMP_HPP */



temp.cpp :



#include <iostream>
#include "temp.hpp"

template <typename T>
void show(T a)
{
    std::cout << "a:" << a << std::endl;
}



main :



#include <iostream>
#include "TextTable.h"
#include "temp.hpp"

int main()
{
    int a = 5;
    float b = 2.5;
    show(a);
    return 0;
}

php - How can I access an outside variable inside a function in JavaScript?

I have a percent variable in my JavaScript; that variable is passed in from PHP.



This is the javascript:




test.js



console.log(percent); // * variable was passed from PHP

function displayLoading() {
    console.log(percent); // **
}


If I use console.log(percent) (*) outside the function, it prints the value of percent to the console. But if I use console.log(percent) (**) inside the displayLoading function, it prints undefined.




How can I access the outside variable inside a function?



I've tried this way



from stackoverflow



var funcOne = function() {
    this.sharedVal = percent;
};

var funcTwo = function() {
    console.log(funcOne.sharedVal);
};


and it prints undefined to the console log.



and



from stackoverflow




var per = percents;
console.log(per); // this line prints the value to the console log

function displayLoading() {
    console.log(per); // this prints "undefined" to the console log
    var myPercent = per;
    console.log(per); // and this also prints "undefined" to the console log
}



Neither of the snippets above worked for me; does anyone know another way?
Any help is appreciated, thanks :)



EDITED:



The percents inside javascript above, I get from this code:



headerController.php




<?php
    $percent = $percent + 10;
?>




The main problem has been found: the reason I got undefined is that I printed the percent right before the variable was passed from PHP.




How can I pass a PHP variable directly into a function in a JavaScript file (test.js in this case)?

PHP variables to array

I often see or have to convert a bunch of variables into an array like this:



$array = array("description"=>$description, "title"=>$title, "page"=>$page, "author"=>$author);


Basically, all array keys match the name of the variable that is being passed in. Is there a way to reference a variable name so that it can be passed into the array like so:



$array[varName($description)] = $description;

groovy - Regex capture for multiple occurrences and place them in groups




I have the following line, where the string "winline" may occur one or more times (or not at all), and I do not know in advance how many times it will appear.



Is there a way I can capture every 'winline' that occurs in this text? I am using Groovy; I tried just matching the winline and it does capture all of them, but each is reported as group 1. I want to be able to capture them match by match.



Example using this regex on following line: winline\":([0-9]+)



def matcher
def winningSym = /winline\":([0-9]+)/

if ((matcher = line =~ winningSym)) {
    println matcher[0][1] // prints 5, which is right
    println matcher[1][1] // expect 4 but get IndexOutOfBoundsException
}


Line:




{"Id":1,"winline":5,"Winnings":50000, some random text, "winline":4,

more random text, "winline":7, more stuff}



Answer



You may slightly modify the regex to use a positive lookbehind and use a simpler code:



def winningSym = /(?<=winline":)[0-9]+/
String s = """{"Id":1,"winline":5,"Winnings":50000, some random text, "winline":4, more random text, "winline":7, more stuff}"""
def res = s.findAll(winningSym)
println(res)



See the Groovy demo, output: [5, 4, 7].



To use your regex and collect Group 1 values use .collect on the matcher (as Matcher supports the iterator() method):



def winningSym = /winline":([0-9]+)/
String line = """{"Id":1,"winline":5,"Winnings":50000, some random text, "winline":4, more random text, "winline":7, more stuff}"""
def res = (line =~ winningSym).collect { it[1] }



See another Groovy demo. Here, it[1] will access the contents inside capturing group 1 and .collect will iterate through all matches.


C/C++ floating point issue

I am struggling with a basic floating-point precision issue. Here is the problem:



double d = 0.1;
d += 0.1;

d += 0.1;

d == 0.3 ? std::cout << "yes" : std::cout << "no";


Run the code and you get "no"



I understand that C/C++ store values in binary and that binary storage can not exactly store every value. I also understand that these small errors compound as you do various math operations on them (i.e. d += 0.1;).



My question is: if I do need to test whether d == 0.3 (to a reasonable precision, as is the clear intent of the code above), how do I do that? I hope the answer is not:




if (d > 0.2999 && d < 0.3001) ...


ALSO.. this works



float f = 0.1;
f += 0.1;
f += 0.1;


f == 0.3f ? std::cout << "yes" : std::cout << "no";


but I can find no equivalent "0.3d" suffix in the language.



Thanks

multiple redirects with .htaccess



I want to do this



website.com/blog/index.php -> website.com/blog



website.com/admin/archief_login.php -> website.com/admin



This works with my code, but I want to add this:



website.com/aa -> website.com/web/index.php?init=aa



for some reason the blog gets this redirect: website.com/blog/?init=blog



what is the best way to set these different rewrites?




RewriteEngine on
Options All -Indexes

RewriteCond %{HTTP_HOST} ^websit.com$ [OR]
RewriteCond %{HTTP_HOST} ^www.website.com$
RewriteRule ^admin$ "http\:\/\/www.website.com\/admin/archief_login.php" [R=301,L]

RewriteRule ^blog$ "http\:\/\/www.website.com\/blog/index.php" [R=301,L]

DirectoryIndex client_login.php

RewriteRule ^screen-([a-zA-Z0-9_-]+).html$ index_client.php?screen=$1

RewriteRule ^invoice([a-zA-Z0-9_-]+).html$ make_invoice.php?id=$1

RewriteRule ^pack-([a-zA-Z0-9_-]+).html$ index_client.php?screen=pack_code&wwwcode=$1


Answer



You need to put the more "general" rules lower in the file so they don't match almost all of your URLs.



RewriteRule ^(\w+)$ /web/index.php?init=$1 [L,NC]
RewriteRule ^blog$ /blog/index.php [R=301,L]


The above will do



website.com/aa => website.com/web/index.php?init=aa
website.com/blog => website.com/web/index.php?init=blog


If you reverse the two rules you will get



website.com/aa => website.com/web/index.php?init=aa
website.com/blog => website.com/blog/index.php

What does the "&" sign mean in PHP?




I was trying to find this answer on Google, but I guess the symbol & works as some operator, or is just not generally a searchable term for some reason. Anyhow, I saw this code snippet while learning how to create WordPress plugins, so I just need to know what the & means when it precedes a variable that holds a class object.



//Actions and Filters
if (isset($dl_pluginSeries)) {

    //Actions
    add_action('wp_head', array(&$dl_pluginSeries, 'addHeaderCode'), 1);

    //Filters
    add_filter('the_content', array(&$dl_pluginSeries, 'addContent'));
}

Answer



This will force the variable to be passed by reference. Normally, a copy would be made for simple types. This can come in handy for large strings (performance gain) or if you want to manipulate the variable without using a return statement, e.g.:



$a = 1;

function inc(&$input)
{
    $input++;
}

inc($a);

echo $a; // 2


Objects will be passed by reference automatically.




If you would like to hand a copy over to a function, use



clone $object;


Then, the original object is not altered, eg:



$a = new Obj;
$a->prop = 1;

$b = clone $a;
$b->prop = 2; // $a->prop remains at 1

c++ - Why does changing 0.1f to 0 slow down performance by 10x?




Why does this bit of code,



const float x[16] = {  1.1,   1.2,   1.3,     1.4,   1.5,   1.6,   1.7,   1.8,
                       1.9,   2.0,   2.1,     2.2,   2.3,   2.4,   2.5,   2.6};
const float z[16] = {1.123, 1.234, 1.345, 156.467, 1.578, 1.689, 1.790, 1.812,
                     1.923, 2.034, 2.145,   2.256, 2.367, 2.478, 2.589, 2.690};
float y[16];
for (int i = 0; i < 16; i++)
{
    y[i] = x[i];
}

for (int j = 0; j < 9000000; j++)
{
    for (int i = 0; i < 16; i++)
    {
        y[i] *= x[i];
        y[i] /= z[i];
        y[i] = y[i] + 0.1f; // <--
        y[i] = y[i] - 0.1f; // <--
    }
}


run more than 10 times faster than the following bit (identical except where noted)?



const float x[16] = {  1.1,   1.2,   1.3,     1.4,   1.5,   1.6,   1.7,   1.8,
                       1.9,   2.0,   2.1,     2.2,   2.3,   2.4,   2.5,   2.6};
const float z[16] = {1.123, 1.234, 1.345, 156.467, 1.578, 1.689, 1.790, 1.812,
                     1.923, 2.034, 2.145,   2.256, 2.367, 2.478, 2.589, 2.690};

float y[16];
for (int i = 0; i < 16; i++)
{
    y[i] = x[i];
}

for (int j = 0; j < 9000000; j++)
{
    for (int i = 0; i < 16; i++)
    {
        y[i] *= x[i];
        y[i] /= z[i];
        y[i] = y[i] + 0; // <--
        y[i] = y[i] - 0; // <--
    }
}


when compiling with Visual Studio 2010 SP1.
The optimization level was /O2 with SSE2 enabled.

I haven't tested with other compilers.


Answer



Welcome to the world of denormalized floating-point! They can wreak havoc on performance!!!



Denormal (or subnormal) numbers are kind of a hack to get some extra values very close to zero out of the floating point representation. Operations on denormalized floating-point can be tens to hundreds of times slower than on normalized floating-point. This is because many processors can't handle them directly and must trap and resolve them using microcode.



If you print out the numbers after 10,000 iterations, you will see that they have converged to different values depending on whether 0 or 0.1 is used.



Here's the test code compiled on x64:




#include <cstdlib>
#include <iostream>
#include <omp.h>
using namespace std;

int main() {

    double start = omp_get_wtime();

    const float x[16]={1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,2.0,2.1,2.2,2.3,2.4,2.5,2.6};
    const float z[16]={1.123,1.234,1.345,156.467,1.578,1.689,1.790,1.812,1.923,2.034,2.145,2.256,2.367,2.478,2.589,2.690};
    float y[16];
    for(int i=0;i<16;i++)
    {
        y[i]=x[i];
    }
    for(int j=0;j<9000000;j++)
    {
        for(int i=0;i<16;i++)
        {
            y[i]*=x[i];
            y[i]/=z[i];
#ifdef FLOATING
            y[i]=y[i]+0.1f;
            y[i]=y[i]-0.1f;
#else
            y[i]=y[i]+0;
            y[i]=y[i]-0;
#endif

            if (j > 10000)
                cout << y[i] << " ";
        }
        if (j > 10000)
            cout << endl;
    }

    double end = omp_get_wtime();
    cout << end - start << endl;

    system("pause");
    return 0;
}



Output:



#define FLOATING
1.78814e-007 1.3411e-007 1.04308e-007 0 7.45058e-008 6.70552e-008 6.70552e-008 5.58794e-007 3.05474e-007 2.16067e-007 1.71363e-007 1.49012e-007 1.2666e-007 1.11759e-007 1.04308e-007 1.04308e-007
1.78814e-007 1.3411e-007 1.04308e-007 0 7.45058e-008 6.70552e-008 6.70552e-008 5.58794e-007 3.05474e-007 2.16067e-007 1.71363e-007 1.49012e-007 1.2666e-007 1.11759e-007 1.04308e-007 1.04308e-007

//#define FLOATING
6.30584e-044 3.92364e-044 3.08286e-044 0 1.82169e-044 1.54143e-044 2.10195e-044 2.46842e-029 7.56701e-044 4.06377e-044 3.92364e-044 3.22299e-044 3.08286e-044 2.66247e-044 2.66247e-044 2.24208e-044
6.30584e-044 3.92364e-044 3.08286e-044 0 1.82169e-044 1.54143e-044 2.10195e-044 2.45208e-029 7.56701e-044 4.06377e-044 3.92364e-044 3.22299e-044 3.08286e-044 2.66247e-044 2.66247e-044 2.24208e-044



Note how in the second run the numbers are very close to zero.



Denormalized numbers are generally rare and thus most processors don't try to handle them efficiently.






To demonstrate that this has everything to do with denormalized numbers, if we flush denormals to zero by adding this to the start of the code:



_MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);



Then the version with 0 is no longer 10x slower and actually becomes faster. (This requires that the code be compiled with SSE enabled.)



This means that rather than using these weird lower precision almost-zero values, we just round to zero instead.



Timings: Core i7 920 @ 3.5 GHz:



//  Don't flush denormals to zero.
0.1f: 0.564067

0 : 26.7669

// Flush denormals to zero.
0.1f: 0.587117
0 : 0.341406


In the end, this really has nothing to do with whether it's an integer or floating-point. The 0 or 0.1f is converted/stored into a register outside of both loops. So that has no effect on performance.


Wednesday 30 August 2017

c# - Creating a comma separated list from IList or IEnumerable

.NET 4+



IList<string> strings = new List<string> {"1", "2", "testing"};
string joined = string.Join(",", strings);


Detail & Pre .Net 4.0 Solutions




An IEnumerable<string> can be converted into a string array very easily with LINQ (.NET 3.5):



IEnumerable<string> strings = ...;
string[] array = strings.ToArray();


It's easy enough to write the equivalent helper method if you need to:



public static T[] ToArray<T>(IEnumerable<T> source)
{
    return new List<T>(source).ToArray();
}


Then call it like this:



IEnumerable<string> strings = ...;
string[] array = Helpers.ToArray(strings);



You can then call string.Join. Of course, you don't have to use a helper method:



// C# 3 and .NET 3.5 way:
string joined = string.Join(",", strings.ToArray());
// C# 2 and .NET 2.0 way:
string joined = string.Join(",", new List<string>(strings).ToArray());


The latter is a bit of a mouthful though :)




This is likely to be the simplest way to do it, and quite performant as well - there are other questions about exactly what the performance is like, including (but not limited to) this one.



As of .NET 4.0, there are more overloads available in string.Join, so you can actually just write:



string joined = string.Join(",", strings);


Much simpler :)

php - $function() and $$variable



What the heck is $function(); and $$variable for?



Have never heard of these before, and searching on google doesn't give anything useful (possible that my keywords aren't perfect).



via https://stackoverflow.com/questions/4891301/top-bad-practices-in-php/4891422#4891422


Answer



$function() is a variable function and $$variable is a variable variable.




Those linked pages should give you plenty to go on, or at the very least some actual words to search with.


javascript - Ensure variable is available before running filter code

I set a rootScope variable in my run block like so:



.run(['$rootScope', 'numbersService', function($rootScope, numbersService) {
    numbersService.getAvailableCodes().$promise.then(function(data) {
        $rootScope.availableCodes = data.codes;
    });
}]);


I need $rootScope.availableCodes to be available in order for my filter to run properly. Here is the beginning of my filter code:



.filter('formatValue', ['$rootScope', function($rootScope) {
    var availableCodes = $rootScope.availableCodes;
    ...


My problem is that since $rootScope.availableCodes is set based on an async call it is not guaranteed to be available when my filter runs (this filter appears in dozens of places throughout my HTML). How can I wait to run the rest of my filter logic until $rootScope.availableCodes is set?

r - Create an input variable that is dependent on another input variable in flexdashboard shiny widget



I am trying to create a user input in flexdashboard that is dependent on another user input. Example dataset:

alphabet_data <- read.table(
    text = "Alphabet Number
            ABC 1
            DEF 4
            ABD 5
            ABC 2
            ABC 3
            ABD 6
            ABD 7
            ABD 8",
    header = TRUE,
    stringsAsFactors = FALSE)




The user selects an "alphabet" in selectizeInput, say ABD; based on that, I want the selectizeInput options for "number" to show only 5, 6, 7, 8.




  1. I tried observeEvent on "alphabet", to create the new dependent input fresh

  2. I created the dependent input "number" with NULL choices, and used observeEvent to run updateSelectizeInput.

  3. I created a new table based on alphabet choice, and then used that table within reactive to create "number" input.

  4. There's code below to reproduce the issue.







---
title: "Untitled"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
runtime: shiny
---




library(flexdashboard)
library(tidyverse)


alphabet_data <- read.table(
    text = "Alphabet Number
            ABC 1
            DEF 4
            ABD 5
            ABC 2
            ABC 3
            ABD 6
            ABD 7
            ABD 8",
    header = TRUE,
    stringsAsFactors = FALSE)


Column {.sidebar data-width=650}




Chart A




selectizeInput(
    inputId = "alphabet",
    label = "Alphabet",
    choices = unique(alphabet_data$Alphabet),
    multiple = TRUE,
    options = list(maxItems = 2)
)

selectizeInput(
    inputId = "number",
    label = "Number",
    choices = NULL,
    multiple = TRUE,
    options = list(maxItems = 2)
)

selected_alphabet <- eventReactive(
    eventExpr = input$alphabet,
    valueExpr = {
        alphabet_data %>%
            filter(Alphabet %in% input$alphabet)
    })

reactive({
    observeEvent(
        eventExpr = input$alphabet,
        handlerExpr = {
            updateSelectizeInput(
                inputId = "number",
                choices = selected_alphabet()$number
            )
        }
    )
})





Column {data-width=350}



Chart B



output$alphabet <- renderPrint(input$alphabet)
textOutput(outputId = "alphabet")



Chart C



renderPrint(selected_alphabet())


Chart D



output$number <- renderPrint(input$number)
textOutput(outputId = "number")



I expect that when the user selects the ABD alphabet, the options for number show as 5, 6, 7, 8.


Answer



I'm having trouble running your sample script, so I wrote a similar one.



You have two options:




  1. Use renderUI() or insertUI() to generate UI components in server.

  2. Use updateSelectInput() to update existing UI components.




I wrote a demo in shiny, though it's not using flexdashboard, it does the same thing:



library(shiny)

ui <- fluidPage(
    fluidRow(
        tags$h1("level 1"),
        column(
            width = 6,
            selectizeInput("selectizeInput1", "Input 1", choices = letters, selected = "", multiple = TRUE)
        ),
        column(
            width = 6,
            textOutput("textOutput1")
        )
    ),
    fluidRow(
        tags$h1("level 2"),
        column(
            width = 6,
            selectizeInput("selectizeInput2", "Input 2", choices = "", selected = "", multiple = TRUE)
        ),
        column(
            width = 6,
            textOutput("textOutput2")
        )
    ),
    fluidRow(
        tags$h1("level 3"),
        column(
            width = 6,
            selectizeInput("selectizeInput3", "Input 3", choices = "", selected = "", multiple = TRUE)
        ),
        column(
            width = 6,
            textOutput("textOutput3")
        )
    )
)

server <- function(input, output, session) {
    # level 1
    output$textOutput1 <- renderText(input$selectizeInput1)

    # level 2
    observe({
        updateSelectInput(
            session = session,
            inputId = "selectizeInput2",
            choices = input$selectizeInput1,
            selected = input$selectizeInput1
        )
        output$textOutput2 <- renderText(input$selectizeInput2)
    })

    # level 3
    observe({
        updateSelectInput(
            session = session,
            inputId = "selectizeInput3",
            choices = input$selectizeInput2,
            selected = input$selectizeInput2
        )
        output$textOutput3 <- renderText(input$selectizeInput3)
    })
}

shinyApp(ui, server)


For a better understanding, you can read this article or try out this app.


PHP random string generator



I'm trying to create a randomized string in PHP, and I get absolutely no output with this:



function RandomString()
{
    $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $randstring = '';
    for ($i = 0; $i < 10; $i++) {
        $randstring = $characters[rand(0, strlen($characters))];
    }
    return $randstring;
}

RandomString();
echo $randstring;



What am I doing wrong?


Answer



To answer this question specifically, two problems:




  1. $randstring is not in scope when you echo it.

  2. The characters are not getting concatenated together in the loop.




Here's a code snippet with the corrections:



function generateRandomString($length = 10) {
    $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $charactersLength = strlen($characters);
    $randomString = '';
    for ($i = 0; $i < $length; $i++) {
        $randomString .= $characters[rand(0, $charactersLength - 1)];
    }
    return $randomString;
}


Output the random string with the call below:



// Echo the random string.
// Optionally, you can give it a desired string length.
echo generateRandomString();




Please note that this generates predictable random strings. If you want to create secure tokens, see this answer.



multithreading - Does the C++ volatile keyword introduce a memory fence?

While I was working through an online downloadable video tutorial for 3D graphics and game-engine development with modern OpenGL, we used volatile within one of our classes. The tutorial website can be found here, and the video working with the volatile keyword is found in the Shader Engine series, video 98. These works are not my own but are accredited to Marek A. Krzeminski, MASc; this is an excerpt from the video download page.

If you are subscribed to his website and have access to his videos, within this video he references this article concerning the use of volatile in multithreaded programming.


volatile: The Multithreaded Programmer's Best Friend



By Andrei Alexandrescu, February 01, 2001




The volatile keyword was devised to prevent compiler optimizations that might render code incorrect in the presence of certain asynchronous events.



I don't want to spoil your mood, but this column addresses the dreaded topic of multithreaded programming. If — as the previous installment of Generic says — exception-safe programming is hard, it's child's play compared to multithreaded programming.



Programs using multiple threads are notoriously hard to write, prove correct, debug, maintain, and tame in general. Incorrect multithreaded programs might run for years without a glitch, only to unexpectedly run amok because some critical timing condition has been met.



Needless to say, a programmer writing multithreaded code needs all the help she can get. This column focuses on race conditions — a common source of trouble in multithreaded programs — and provides you with insights and tools on how to avoid them and, amazingly enough, have the compiler work hard at helping you with that.



Just a Little Keyword




Although both C and C++ Standards are conspicuously silent when it comes to threads, they do make a little concession to multithreading, in the form of the volatile keyword.



Just like its better-known counterpart const, volatile is a type modifier. It's intended to be used in conjunction with variables that are accessed and modified in different threads. Basically, without volatile, either writing multithreaded programs becomes impossible, or the compiler wastes vast optimization opportunities. An explanation is in order.



Consider the following code:



class Gadget {
public:
    void Wait() {
        while (!flag_) {
            Sleep(1000); // sleeps for 1000 milliseconds
        }
    }
    void Wakeup() {
        flag_ = true;
    }
    ...
private:
    bool flag_;
};


The purpose of Gadget::Wait above is to check the flag_ member variable every second and return when that variable has been set to true by another thread. At least that's what its programmer intended, but, alas, Wait is incorrect.



Suppose the compiler figures out that Sleep(1000) is a call into an external library that cannot possibly modify the member variable flag_. Then the compiler concludes that it can cache flag_ in a register and use that register instead of accessing the slower on-board memory. This is an excellent optimization for single-threaded code, but in this case, it harms correctness: after you call Wait for some Gadget object, although another thread calls Wakeup, Wait will loop forever. This is because the change of flag_ will not be reflected in the register that caches flag_. The optimization is too ... optimistic.



Caching variables in registers is a very valuable optimization that applies most of the time, so it would be a pity to waste it. C and C++ give you the chance to explicitly disable such caching. If you use the volatile modifier on a variable, the compiler won't cache that variable in registers — each access will hit the actual memory location of that variable. So all you have to do to make Gadget's Wait/Wakeup combo work is to qualify flag_ appropriately:



class Gadget {
public:
    ... as above ...
private:
    volatile bool flag_;
};


Most explanations of the rationale and usage of volatile stop here and advise you to volatile-qualify the primitive types that you use in multiple threads. However, there is much more you can do with volatile, because it is part of C++'s wonderful type system.



Using volatile with User-Defined Types




You can volatile-qualify not only primitive types, but also user-defined types. In that case, volatile modifies the type in a way similar to const. (You can also apply const and volatile to the same type simultaneously.)



Unlike const, volatile discriminates between primitive types and user-defined types. Namely, unlike classes, primitive types still support all of their operations (addition, multiplication, assignment, etc.) when volatile-qualified. For example, you can assign a non-volatile int to a volatile int, but you cannot assign a non-volatile object to a volatile object.



Let's illustrate how volatile works on user-defined types on an example.



class Gadget {
public:
    void Foo() volatile;
    void Bar();
    ...
private:
    String name_;
    int state_;
};
...
Gadget regularGadget;
volatile Gadget volatileGadget;


If you think volatile is not that useful with objects, prepare for some surprise.



volatileGadget.Foo(); // ok, volatile fun called for
// volatile object
regularGadget.Foo(); // ok, volatile fun called for
// non-volatile object
volatileGadget.Bar(); // error! Non-volatile function called for
// volatile object!



The conversion from a non-qualified type to its volatile counterpart is trivial. However, just as with const, you cannot make the trip back from volatile to non-qualified. You must use a cast:



Gadget& ref = const_cast<Gadget&>(volatileGadget);
ref.Bar(); // ok


A volatile-qualified class gives access only to a subset of its interface, a subset that is under the control of the class implementer. Users can gain full access to that type's interface only by using a const_cast. In addition, just like constness, volatileness propagates from the class to its members (for example, volatileGadget.name_ and volatileGadget.state_ are volatile variables).



volatile, Critical Sections, and Race Conditions




The simplest and the most often-used synchronization device in multithreaded programs is the mutex. A mutex exposes the Acquire and Release primitives. Once you call Acquire in some thread, any other thread calling Acquire will block. Later, when that thread calls Release, precisely one thread blocked in an Acquire call will be released. In other words, for a given mutex, only one thread can get processor time in between a call to Acquire and a call to Release. The executing code between a call to Acquire and a call to Release is called a critical section. (Windows terminology is a bit confusing because it calls the mutex itself a critical section, while "mutex" is actually an inter-process mutex. It would have been nice if they were called thread mutex and process mutex.)



Mutexes are used to protect data against race conditions. By definition, a race condition occurs when the effect of more threads on data depends on how threads are scheduled. Race conditions appear when two or more threads compete for using the same data. Because threads can interrupt each other at arbitrary moments in time, data can be corrupted or misinterpreted. Consequently, changes and sometimes accesses to data must be carefully protected with critical sections. In object-oriented programming, this usually means that you store a mutex in a class as a member variable and use it whenever you access that class' state.



Experienced multithreaded programmers might have yawned reading the two paragraphs above, but their purpose is to provide an intellectual workout, because now we will link with the volatile connection. We do this by drawing a parallel between the C++ types' world and the threading semantics world.




  • Outside a critical section, any thread might interrupt any other at any time; there is no control, so consequently variables accessible from multiple threads are volatile. This is in keeping with the original intent of volatile — that of preventing the compiler from unwittingly caching values used by multiple threads at once.

  • Inside a critical section defined by a mutex, only one thread has access. Consequently, inside a critical section, the executing code has single-threaded semantics. The controlled variable is not volatile anymore — you can remove the volatile qualifier.




In short, data shared between threads is conceptually volatile outside a critical section, and non-volatile inside a critical section.



You enter a critical section by locking a mutex. You remove the volatile qualifier from a type by applying a const_cast. If we manage to put these two operations together, we create a connection between C++'s type system and an application's threading semantics. We can make the compiler check race conditions for us.



LockingPtr



We need a tool that collects a mutex acquisition and a const_cast. Let's develop a LockingPtr class template that you initialize with a volatile object obj and a mutex mtx. During its lifetime, a LockingPtr keeps mtx acquired. Also, LockingPtr offers access to the volatile-stripped obj. The access is offered in a smart pointer fashion, through operator-> and operator*. The const_cast is performed inside LockingPtr. The cast is semantically valid because LockingPtr keeps the mutex acquired for its lifetime.




First, let's define the skeleton of a class Mutex with which LockingPtr will work:



class Mutex {
public:
    void Acquire();
    void Release();
    ...
};



To use LockingPtr, you implement Mutex using your operating system's native data structures and primitive functions.



LockingPtr is templated with the type of the controlled variable. For example, if you want to control a Widget, you use a LockingPtr that you initialize with a variable of type volatile Widget.



LockingPtr's definition is very simple. LockingPtr implements an unsophisticated smart pointer. It focuses solely on collecting a const_cast and a critical section.



template <typename T>
class LockingPtr {
public:
    // Constructors/destructors
    LockingPtr(volatile T& obj, Mutex& mtx)
        : pObj_(const_cast<T*>(&obj)), pMtx_(&mtx) {
        mtx.Acquire();
    }
    ~LockingPtr() {
        pMtx_->Release();
    }
    // Pointer behavior
    T& operator*() {
        return *pObj_;
    }
    T* operator->() {
        return pObj_;
    }
private:
    T* pObj_;
    Mutex* pMtx_;
    LockingPtr(const LockingPtr&);
    LockingPtr& operator=(const LockingPtr&);
};



In spite of its simplicity, LockingPtr is a very useful aid in writing correct multithreaded code. You should define objects that are shared between threads as volatile and never use const_cast with them — always use LockingPtr automatic objects. Let's illustrate this with an example.



Say you have two threads that share a vector object:



class SyncBuf {
public:
    void Thread1();
    void Thread2();
private:
    typedef vector<char> BufT;
    volatile BufT buffer_;
    Mutex mtx_; // controls access to buffer_
};


Inside a thread function, you simply use a LockingPtr to get controlled access to the buffer_ member variable:



void SyncBuf::Thread1() {
    LockingPtr<BufT> lpBuf(buffer_, mtx_);
    BufT::iterator i = lpBuf->begin();
    for (; i != lpBuf->end(); ++i) {
        ... use *i ...
    }
}


The code is very easy to write and understand — whenever you need to use buffer_, you must create a LockingPtr pointing to it. Once you do that, you have access to vector's entire interface.




The nice part is that if you make a mistake, the compiler will point it out:



void SyncBuf::Thread2() {
    // Error! Cannot access 'begin' for a volatile object
    BufT::iterator i = buffer_.begin();
    // Error! Cannot access 'end' for a volatile object
    for (; i != buffer_.end(); ++i) {
        ... use *i ...
    }
}



You cannot access any function of buffer_ until you either apply a const_cast or use LockingPtr. The difference is that LockingPtr offers an ordered way of applying const_cast to volatile variables.



LockingPtr is remarkably expressive. If you only need to call one function, you can create an unnamed temporary LockingPtr object and use it directly:



unsigned int SyncBuf::Size() {
    return LockingPtr<BufT>(buffer_, mtx_)->size();
}
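Read today, the LockingPtr recipe ports almost verbatim to standard C++. The sketch below is mine, not the article's: it substitutes std::mutex for the hand-rolled Mutex (lock()/unlock() in place of Acquire()/Release()) and reuses the article's Counter as a usage example.

```cpp
#include <cassert>
#include <mutex>

// The article's LockingPtr, with std::mutex standing in for the
// hand-rolled Mutex class. The mutex is held for the LockingPtr's
// whole lifetime, which is what makes the const_cast legitimate.
template <typename T>
class LockingPtr {
public:
    LockingPtr(volatile T& obj, std::mutex& mtx)
        : pObj_(const_cast<T*>(&obj)), pMtx_(&mtx) {
        pMtx_->lock();
    }
    ~LockingPtr() { pMtx_->unlock(); }

    // Smart-pointer style access to the volatile-stripped object.
    T& operator*() { return *pObj_; }
    T* operator->() { return pObj_; }

    LockingPtr(const LockingPtr&) = delete;
    LockingPtr& operator=(const LockingPtr&) = delete;

private:
    T* pObj_;
    std::mutex* pMtx_;
};

// Usage in the style of the article's Counter: every access goes
// through a temporary LockingPtr, so the lock is held exactly for
// the duration of the expression.
class Counter {
public:
    void Increment() { ++*LockingPtr<int>(ctr_, mtx_); }
    int Value() { return *LockingPtr<int>(ctr_, mtx_); }

private:
    volatile int ctr_ = 0;
    std::mutex mtx_;
};
```

Because the temporary LockingPtr is destroyed at the end of the full expression, `++*LockingPtr<int>(ctr_, mtx_)` performs the whole read-modify-write under the lock.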



Back to Primitive Types



We saw how nicely volatile protects objects against uncontrolled access and how LockingPtr provides a simple and effective way of writing thread-safe code. Let's now return to primitive types, which are treated differently by volatile.



Let's consider an example where multiple threads share a variable of type int.



class Counter {
public:
    ...
    void Increment() { ++ctr_; }
    void Decrement() { --ctr_; }
private:
    int ctr_;
};


If Increment and Decrement are to be called from different threads, the fragment above is buggy. First, ctr_ must be volatile. Second, even a seemingly atomic operation such as ++ctr_ is actually a three-stage operation. Memory itself has no arithmetic capabilities. When incrementing a variable, the processor:





  • Reads that variable in a register

  • Increments the value in the register

  • Writes the result back to memory



This three-step operation is called RMW (Read-Modify-Write). During the Modify part of an RMW operation, most processors free the memory bus in order to give other processors access to the memory.



If at that time another processor performs a RMW operation on the same variable, we have a race condition: the second write overwrites the effect of the first.



To avoid that, you can rely, again, on LockingPtr:




class Counter {
public:
    ...
    void Increment() { ++*LockingPtr<int>(ctr_, mtx_); }
    void Decrement() { --*LockingPtr<int>(ctr_, mtx_); }
private:
    volatile int ctr_;
    Mutex mtx_;
};



Now the code is correct, but its quality is inferior when compared to SyncBuf's code. Why? Because with Counter, the compiler will not warn you if you mistakenly access ctr_ directly (without locking it). The compiler compiles ++ctr_ if ctr_ is volatile, although the generated code is simply incorrect. The compiler is not your ally anymore, and only your attention can help you avoid race conditions.



What should you do then? Simply encapsulate the primitive data that you use in higher-level structures and use volatile with those structures. Paradoxically, it's worse to use volatile directly with built-ins, in spite of the fact that initially this was the usage intent of volatile!



volatile Member Functions



So far, we've had classes that aggregate volatile data members; now let's think of designing classes that in turn will be part of larger objects and shared between threads. Here is where volatile member functions can be of great help.




When designing your class, you volatile-qualify only those member functions that are thread safe. You must assume that code from the outside will call the volatile functions from any code at any time. Don't forget: volatile equals free multithreaded code and no critical section; non-volatile equals single-threaded scenario or inside a critical section.



For example, you define a class Widget that implements an operation in two variants — a thread-safe one and a fast, unprotected one.



class Widget {
public:
    void Operation() volatile;
    void Operation();
    ...
private:
    Mutex mtx_;
};


Notice the use of overloading. Now Widget's user can invoke Operation using a uniform syntax either for volatile objects and get thread safety, or for regular objects and get speed. The user must be careful about defining the shared Widget objects as volatile.



When implementing a volatile member function, the first operation is usually to lock this with a LockingPtr. Then the work is done by using the non-volatile sibling:



void Widget::Operation() volatile {
    LockingPtr<Widget> lpThis(*this, mtx_);
    lpThis->Operation(); // invokes the non-volatile function
}
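The overload-resolution behaviour described above can be checked in isolation. This stripped-down Widget is mine (the return strings are just labels, not part of the article); it shows the compiler picking the volatile overload for a volatile object and the plain overload otherwise.

```cpp
#include <cassert>
#include <cstring>

// Overload resolution: a volatile-qualified object can only bind to
// the volatile-qualified member function; a plain object prefers the
// unqualified overload.
struct Widget {
    const char* Operation() volatile { return "thread-safe path"; }
    const char* Operation() { return "fast path"; }
};
```

Calling Operation() on a plain Widget selects the fast overload; on a volatile Widget it selects the thread-safe one, mirroring the thread-safe/fast split the article sets up.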


Summary



When writing multithreaded programs, you can use volatile to your advantage. You must stick to the following rules:




  • Define all shared objects as volatile.


  • Don't use volatile directly with primitive types.

  • When defining shared classes, use volatile member functions to express thread safety.



If you do this, and if you use the simple generic component LockingPtr, you can write thread-safe code and worry much less about race conditions, because the compiler will worry for you and will diligently point out the spots where you are wrong.



A couple of projects I've been involved with use volatile and LockingPtr to great effect. The code is clean and understandable. I recall a couple of deadlocks, but I prefer deadlocks to race conditions because they are so much easier to debug. There were virtually no problems related to race conditions. But then you never know.



Acknowledgements




Many thanks to James Kanze and Sorin Jianu who helped with insightful ideas.






Andrei Alexandrescu is a Development Manager at RealNetworks Inc. (www.realnetworks.com), based in Seattle, WA, and author of the acclaimed book Modern C++ Design. He may be contacted at www.moderncppdesign.com. Andrei is also one of the featured instructors of The C++ Seminar (www.gotw.ca/cpp_seminar).


This article might be a little dated, but it gives good insight into an excellent use of the volatile modifier in multithreaded programming: keeping events asynchronous while having the compiler check for race conditions for us. This may not directly answer the OP's original question about creating a memory fence, but I chose to post it as an answer for others, as an excellent reference for a good use of volatile when working with multithreaded applications.

python - "Large data" work flows using pandas



I have tried to puzzle out an answer to this question for many months while learning pandas. I use SAS for my day-to-day work and it is great for its out-of-core support. However, SAS is horrible as a piece of software for numerous other reasons.



One day I hope to replace my use of SAS with python and pandas, but I currently lack an out-of-core workflow for large datasets. I'm not talking about "big data" that requires a distributed network, but rather files too large to fit in memory but small enough to fit on a hard-drive.



My first thought is to use HDFStore to hold large datasets on disk and pull only the pieces I need into dataframes for analysis. Others have mentioned MongoDB as an easier to use alternative. My question is this:




What are some best-practice workflows for accomplishing the following:




  1. Loading flat files into a permanent, on-disk database structure

  2. Querying that database to retrieve data to feed into a pandas data structure

  3. Updating the database after manipulating pieces in pandas



Real-world examples would be much appreciated, especially from anyone who uses pandas on "large data".




Edit -- an example of how I would like this to work:




  1. Iteratively import a large flat-file and store it in a permanent, on-disk database structure. These files are typically too large to fit in memory.

  2. In order to use Pandas, I would like to read subsets of this data (usually just a few columns at a time) that can fit in memory.

  3. I would create new columns by performing various operations on the selected columns.

  4. I would then have to append these new columns into the database structure.



I am trying to find a best-practice way of performing these steps. Reading links about pandas and pytables it seems that appending a new column could be a problem.




Edit -- Responding to Jeff's questions specifically:




  1. I am building consumer credit risk models. The kinds of data include phone, SSN and address characteristics; property values; derogatory information like criminal records, bankruptcies, etc... The datasets I use every day have nearly 1,000 to 2,000 fields on average of mixed data types: continuous, nominal and ordinal variables of both numeric and character data. I rarely append rows, but I do perform many operations that create new columns.

  2. Typical operations involve combining several columns using conditional logic into a new, compound column. For example, if var1 > 2 then newvar = 'A' elif var2 = 4 then newvar = 'B'. The result of these operations is a new column for every record in my dataset.

  3. Finally, I would like to append these new columns into the on-disk data structure. I would repeat step 2, exploring the data with crosstabs and descriptive statistics trying to find interesting, intuitive relationships to model.

  4. A typical project file is usually about 1GB. Files are organized into such a manner where a row consists of a record of consumer data. Each row has the same number of columns for every record. This will always be the case.

  5. It's pretty rare that I would subset by rows when creating a new column. However, it's pretty common for me to subset on rows when creating reports or generating descriptive statistics. For example, I might want to create a simple frequency for a specific line of business, say Retail credit cards. To do this, I would select only those records where the line of business = retail in addition to whichever columns I want to report on. When creating new columns, however, I would pull all rows of data and only the columns I need for the operations.

  6. The modeling process requires that I analyze every column, look for interesting relationships with some outcome variable, and create new compound columns that describe those relationships. The columns that I explore are usually done in small sets. For example, I will focus on a set of say 20 columns just dealing with property values and observe how they relate to defaulting on a loan. Once those are explored and new columns are created, I then move on to another group of columns, say college education, and repeat the process. What I'm doing is creating candidate variables that explain the relationship between my data and some outcome. At the very end of this process, I apply some learning techniques that create an equation out of those compound columns.




It is rare that I would ever add rows to the dataset. I will nearly always be creating new columns (variables or features in statistics/machine learning parlance).


Answer



I routinely use tens of gigabytes of data in just this fashion
e.g. I have tables on disk that I read via queries, create data and append back.



It's worth reading the docs and the late entries in this thread for several suggestions on how to store your data.



Give as much detail as you can, and I can help you develop a structure. Details which will affect how you store your data:





  1. Size of data, # of rows, columns, types of columns; are you appending
    rows, or just columns?

  2. What will typical operations look like. E.g. do a query on columns to select a bunch of rows and specific columns, then do an operation (in-memory), create new columns, save these.
    (Giving a toy example could enable us to offer more specific recommendations.)

  3. After that processing, then what do you do? Is step 2 ad hoc, or repeatable?

  4. Input flat files: how many, rough total size in GB. How are these organized, e.g. by records? Does each one contain different fields, or do they have some records per file with all of the fields in each file?

  5. Do you ever select subsets of rows (records) based on criteria (e.g. select the rows with field A > 5)? and then do something, or do you just select fields A, B, C with all of the records (and then do something)?

  6. Do you 'work on' all of your columns (in groups), or are there a good proportion that you may only use for reports (e.g. you want to keep the data around, but don't need to pull in that column explicitly until final results time)?




Solution



Ensure you have pandas at least 0.10.1 installed.



Read iterating files chunk-by-chunk and multiple table queries.



Since pytables is optimized to operate row-wise (which is what you query on), we will create a table for each group of fields. This way it's easy to select a small group of fields (which will work with a big table, but it's more efficient to do it this way... I think I may be able to fix this limitation in the future... this is more intuitive anyhow):
(The following is pseudocode.)



import numpy as np
import pandas as pd

# create a store
store = pd.HDFStore('mystore.h5')

# this is the key to your storage:
# this maps your fields to a specific group, and defines
# what you want to have as data_columns.
# you might want to create a nice class wrapping this
# (as you will want to have this map and its inversion)
group_map = dict(
    A = dict(fields = ['field_1','field_2',.....], dc = ['field_1',....,'field_5']),
    B = dict(fields = ['field_10',...... ], dc = ['field_10']),
    .....
    REPORTING_ONLY = dict(fields = ['field_1000','field_1001',...], dc = []),
)

group_map_inverted = dict()
for g, v in group_map.items():
    group_map_inverted.update(dict([ (f,g) for f in v['fields'] ]))


Reading in the files and creating the storage (essentially doing what append_to_multiple does):



for f in files:
    # read in the file, additional options may be necessary here
    # the chunksize is not strictly necessary, you may be able to slurp each
    # file into memory in which case just eliminate this part of the loop
    # (you can also change chunksize if necessary)
    for chunk in pd.read_table(f, chunksize=50000):
        # we are going to append to each table by group
        # we are not going to create indexes at this time
        # but we *ARE* going to create (some) data_columns

        # figure out the field groupings
        for g, v in group_map.items():
            # create the frame for this group
            frame = chunk.reindex(columns = v['fields'], copy = False)

            # append it
            store.append(g, frame, index=False, data_columns = v['dc'])


Now you have all of the tables in the file (actually you could store them in separate files if you wish; you would probably have to add the filename to the group_map, but this probably isn't necessary).



This is how you get columns and create new ones:



frame = store.select(group_that_I_want)
# you can optionally specify:
# columns = a list of the columns IN THAT GROUP (if you wanted to
# select only say 3 out of the 20 columns in this sub-table)
# and a where clause if you want a subset of the rows

# do calculations on this frame
new_frame = cool_function_on_frame(frame)

# to 'add columns', create a new group (you probably want to
# limit the columns in this new_group to be only NEW ones
# (e.g. so you don't overlap from the other tables)
# add this info to the group_map
store.append(new_group, new_frame.reindex(columns = new_columns_created, copy = False), data_columns = new_columns_created)


When you are ready for post_processing:



# This may be a bit tricky; and depends what you are actually doing.
# I may need to modify this function to be a bit more general:
report_data = store.select_as_multiple([groups_1,groups_2,.....], where =['field_1>0', 'field_1000=foo'], selector = group_1)



About data_columns, you don't actually need to define ANY data_columns; they allow you to sub-select rows based on the column. E.g. something like:



store.select(group, where = ['field_1000=foo', 'field_1001>0'])


They may be most interesting to you in the final report generation stage (essentially a data column is segregated from other columns, which might impact efficiency somewhat if you define a lot).



You also might want to:





  • create a function which takes a list of fields, looks up the groups in the groups_map, then selects these and concatenates the results so you get the resulting frame (this is essentially what select_as_multiple does). This way the structure would be pretty transparent to you.

  • indexes on certain data columns (makes row-subsetting much faster).

  • enable compression.



Let me know when you have questions!
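The chunk-and-split half of the recipe above can be tried without any files or PyTables at all. This sketch is mine: an in-memory CSV and invented field names stand in for the real flat files, and a plain dict of lists stands in for store.append; the loop structure mirrors the one shown earlier.

```python
import io

import pandas as pd

# In-memory stand-in for the flat files; field names are invented.
csv = io.StringIO(
    "field_1,field_2,field_10\n"
    "1,2,3\n"
    "4,5,6\n"
    "7,8,9\n"
)

# Same shape as the group_map above, minus the elided fields.
group_map = {
    "A": {"fields": ["field_1", "field_2"], "dc": ["field_1"]},
    "B": {"fields": ["field_10"], "dc": ["field_10"]},
}

# Collect per-group frames chunk by chunk, as store.append() would.
collected = {g: [] for g in group_map}
for chunk in pd.read_csv(csv, chunksize=2):
    for g, v in group_map.items():
        collected[g].append(chunk.reindex(columns=v["fields"]))

tables = {g: pd.concat(frames, ignore_index=True)
          for g, frames in collected.items()}
print(tables["A"].shape)  # (3, 2)
```

Each entry of `tables` plays the role of one on-disk table: group A carries two fields, group B one, and every group has the same three rows.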


How to initialize ThreeTen Android backport in a Unit test

I use this library for storing date- and time-related data in my app. When the application starts, AndroidThreeTen is initialized first so that the library functions properly. So I want to ask: how do I initialize it when unit testing? E.g. I want to test using LocalDate, LocalDateTime, etc.



My current way is like this:



class OverviewViewModelTest {


@Rule
@JvmField
val rule = InstantTaskExecutorRule()

@Before
fun setup() {
AndroidThreeTen.init(Application())
}

//...

}


But it throws this error:



java.lang.ExceptionInInitializerError
at org.threeten.bp.ZoneRegion.ofId(ZoneRegion.java:143)
at org.threeten.bp.ZoneId.of(ZoneId.java:358)
at org.threeten.bp.ZoneId.of(ZoneId.java:286)
at org.threeten.bp.ZoneId.systemDefault(ZoneId.java:245)
at org.threeten.bp.Clock.systemDefaultZone(Clock.java:137)
at org.threeten.bp.LocalDate.now(LocalDate.java:165)
Caused by: java.lang.RuntimeException: Method getAssets in android.content.ContextWrapper not mocked. See http://g.co/androidstudio/not-mocked for details.
at android.content.ContextWrapper.getAssets(ContextWrapper.java)
at com.jakewharton.threetenabp.AssetsZoneRulesInitializer.initializeProviders(AssetsZoneRulesInitializer.java:22)
at org.threeten.bp.zone.ZoneRulesInitializer.initialize(ZoneRulesInitializer.java:89)
at org.threeten.bp.zone.ZoneRulesProvider.<clinit>(ZoneRulesProvider.java:82)
... 32 more



So how can I get this library to work in unit tests?

redirect - PHP redirecting not showing content of the page redirected to




Page1.php code:



...
header('Location: editt.php?msg=landing&msisdn='.$msisdn);
...


Page2.php code:




$msisdn = $_GET['msisdn'];
$msg = $_GET['msg'];
echo 'Page 2 Content '.$msisdn;

Answer



Why not use headers as a container for your data:



header('msg: msgValue');
header('msisdn: msisdnValue');
header('Location: editt.php');



in editt.php:



$headers = apache_request_headers();

foreach ($headers as $header => $value) {
    echo "$header: $value<br />\n";
}


php - Why won't my data go into my database?



I would like these values to go into my database but it just won't do it.



Form Code





(The form's HTML markup was lost in archiving; it POSTs the fields name, cig, brand, unit, pris, ligther, place, tid, howlong, day and pers. The first field label is "Navn:".)
PHP Code




session_start();
include 'dbh.php';

$name = $_POST['name'];
$cig = $_POST['cig'];
$brand = $_POST['brand'];
$unit = $_POST['unit'];
$pris = $_POST['pris'];
$ligther = $_POST['ligther'];

$place = $_POST['place'];
$tid = $_POST['tid'];
$howlong = $_POST['howlong'];
$day = $_POST['day'];
$pers = $_POST['pers'];

$sql = "INSERT INTO prod (name, cig, brand, unit, pris, lighter, place, tid, howlong, day, pers)
VALUES ('$name', '$cig', '$brand', '$unit', '$pris', '$ligther', '$place', '$tid', '$howling', '$day', '$pers')";
$result = mysqli_query($conn, $sql);


header("Location: index.php");


Answer



You are on the right track but really need to refine your code. First, the way you are building the SQL string (interpolating raw user input) is out of date and susceptible to SQL injection. You should read about prepared statements and make them your way of coding in the future.



Second, as one of the other posters mentioned, you are including the same file more than once.



Third, you should wrap the insert logic in an isset() check (e.g. an if/else on the submit button) so that your code runs only when the form has actually been submitted. Right now the entire script runs on every request, even when there is no data to insert. Check for a proper "Submit" via isset() and execute the PHP code at that point.




See here for more help:



how to insert into mysql using Prepared Statement with php
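As a sketch of the prepared-statement style the answer recommends (this is not the asker's final code; only three of the form fields are shown for brevity, and $conn is assumed to be the mysqli connection created in dbh.php):

```php
<?php
session_start();
include 'dbh.php'; // assumed to define $conn (a mysqli connection)

if (isset($_POST['submit'])) { // run only on an actual form submission
    // Placeholders (?) keep user input out of the SQL string entirely.
    $stmt = mysqli_prepare($conn,
        "INSERT INTO prod (name, cig, brand) VALUES (?, ?, ?)");

    // "sss" = three string parameters, bound in order of the placeholders.
    mysqli_stmt_bind_param($stmt, "sss",
        $_POST['name'], $_POST['cig'], $_POST['brand']);
    mysqli_stmt_execute($stmt);

    header("Location: index.php");
    exit;
}
```

Note that this also fixes the silent failure in the original: with variables bound by name, typos like $ligther vs. $lighter or $howling vs. $howlong can no longer slip into the query string unnoticed.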


php - Problem accessing the value of the json with a point in its name





I need to access a JSON value whose key contains a dot in its name.



I would like to access the "proy_sim.name" field but I do not know how




{
    "prsp_sol": [
        {
            "proy_sim.name": "Vehículos",
            "prsp_def.name": "TRACTOR"
        }
    ]
}

Answer




After decoding with json_decode() you'll realize that there is an additional array you're not accounting for:



$json = '{
    "prsp_sol": [
        {
            "proy_sim.name": "Vehículos",
            "prsp_def.name": "TRACTOR"
        }
    ]
}';

$decoded = json_decode($json, true); // true makes it an array
print_r($decoded);

echo $decoded['prsp_sol'][0]['proy_sim.name'];
//-----------------------^ additional nested array


The output:




Array
(
    [prsp_sol] => Array
        (
            [0] => Array
                (
                    [proy_sim.name] => Vehículos
                    [prsp_def.name] => TRACTOR
                )

        )

)

Vehículos





The dot in the key is irrelevant here, since the key is accessed as a quoted string.
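For reference, if the JSON is decoded without the second argument (into objects instead of arrays), a key containing a dot is still reachable by quoting the property name in braces; a small sketch:

```php
<?php
$json = '{"prsp_sol":[{"proy_sim.name":"Vehículos"}]}';

$decoded = json_decode($json); // no `true`: stdClass objects, not arrays
echo $decoded->prsp_sol[0]->{'proy_sim.name'}; // Vehículos
```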


python - Delete column from pandas DataFrame



When deleting a column in a DataFrame I use:



del df['column_name']



And this works great. Why can't I use the following?



del df.column_name


As you can access the column/Series as df.column_name, I expect this to work.


Answer



As you've guessed, the right syntax is




del df['column_name']


It's difficult to make del df.column_name work simply as the result of syntactic limitations in Python. del df[name] gets translated to df.__delitem__(name) under the covers by Python.
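For completeness: besides del, pandas also provides DataFrame.drop, which works by column name and can either return a new frame or mutate in place. A minimal sketch (assuming pandas is installed; the column names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "column_name": [3, 4]})

# drop returns a new DataFrame by default, leaving df untouched
df2 = df.drop(columns=["column_name"])
print(list(df2.columns))  # ['a']

# pass inplace=True to mutate df itself, like `del df['column_name']`
df.drop(columns=["column_name"], inplace=True)
print(list(df.columns))   # ['a']
```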


javascript - Why does Math.cos(90 * Math.PI/180) yield 6.123031769111...e-17 and not zero?

I convert degrees to radians (degrees * Math.PI/180) but why does the following:



Math.cos(90 * Math.PI/180)



yield 6.123031769111...e-17 and not zero?



I'm trying to perform 2D rotations using matrices and the results are completely out of whack.
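Math.PI / 2 is only the closest double-precision approximation of π/2, so the cosine comes out as a tiny nonzero value on the order of 1e-17 rather than exactly 0. A common workaround is to compare against a small epsilon instead of testing for exact equality; a sketch:

```javascript
// Floating-point trig never lands exactly on 0 for cos(90°), because
// 90 * Math.PI / 180 is not exactly pi/2. Treat "close enough" as equal.
function approxEqual(a, b, epsilon = 1e-10) {
  return Math.abs(a - b) < epsilon;
}

const c = Math.cos(90 * Math.PI / 180);
console.log(c);                 // a tiny value around 1e-17, not 0
console.log(approxEqual(c, 0)); // true
```

For rotation matrices this means comparing matrix entries with a tolerance rather than expecting exact zeros.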

comparison - JavaScript: Simple way to check if variable is equal to two or more values?




Is there an easier way to determine if a variable is equal to a range of values, such as:




if x === 5 || 6 


rather than something obtuse like:



if x === 5 || x === 6


?


Answer




You can stash your values inside an array and check whether the variable exists in the array by using [].indexOf:



if([5, 6].indexOf(x) > -1) {
// ...
}


If -1 is returned then the variable doesn't exist in the array.
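On ES2016+ engines, Array.prototype.includes expresses the same membership test more directly than the indexOf comparison:

```javascript
// includes() returns a boolean, avoiding the `indexOf(x) > -1` idiom.
const x = 5;
if ([5, 6].includes(x)) {
  // x is either 5 or 6
}

console.log([5, 6].includes(5)); // true
console.log([5, 6].includes(7)); // false
```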


plot explanation - Who was the girl and why was she seen in Wrecked?

I recently watched Wrecked and have some unclear parts. I realized that during the robbery, the men (robbers) dragged the Man along only to use his vehicle. During that, we are shown the dumbfounded Girl sitting on a bench after a quarrel (?) with the Man. So I concluded that they are fiancé and fiancée.


It also explains why the Man sees the Girl almost every time after the accident (she is constantly in his memory).


But here is the problem: why does He try to kill Her (or actually do it)? And whenever He sees Her, with increasing intensity toward the end of the movie, she does not look well or happy at all.


Does it have something to do with the flashback the Man gets (while leaning on the wrecked car) of himself shooting the Girl?


I would be really thankful to anyone who could clear this up for me. I may be missing something, but can't get what it is...


Is there any correlation between the use of method acting and award or box office success?

Method acting as a concept has been around since the 1930s, developed by Meisner, Adler and Lewis from Stanislavski's system (which predates it by a few decades) of actors creating 'real' emotions.


Though it has evolved slightly over time, the core component is that the actor employing the technique of method acting 'becomes' the character they're playing by taking the (either hypothetical or explicit) psychological and emotional motivations of the character and making them their own.


Actors utilizing the techniques promoted by Stanislavski, Meisner, Adler or Lewis include:
Marlon Brando, Robert De Niro, Al Pacino, Daniel Day-Lewis, Meryl Streep, Marilyn Monroe, etc.


Is there any correlation between using these techniques and success (success being measured by awards, box office tallies, long career)?


python - How to read a large file line by line



I want to iterate over each line of an entire file. One way to do this is by reading the entire file, saving it to a list, and then iterating over the list. This method uses a lot of memory, so I am looking for an alternative.



My code so far:



for each_line in fileinput.input(input_file):
    do_something(each_line)

for each_line_again in fileinput.input(input_file):
    do_something(each_line_again)


Executing this code gives the error message "device active".



Any suggestions?




The purpose is to calculate pair-wise string similarity, meaning for each line in file, I want to calculate the Levenshtein distance with every other line.


Answer



The correct, fully Pythonic way to read a file is the following:



with open(...) as f:
for line in f:
# Do something with 'line'


The with statement handles opening and closing the file, including if an exception is raised in the inner block. The for line in f treats the file object f as an iterable, which automatically uses buffered I/O and memory management so you don't have to worry about large files.





There should be one -- and preferably only one -- obvious way to do it.
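For the pair-wise Levenshtein goal from the question, the same file can simply be opened twice with nested with blocks. That sidesteps the "device active" error from reusing fileinput on one file, and only two lines are ever held in memory at a time. A sketch; compare_lines is a hypothetical callback standing in for the distance computation:

```python
def compare_all_pairs(path, compare_lines):
    """Call compare_lines(a, b) for every ordered pair of lines in the
    file at `path`, streaming both passes instead of loading the file."""
    with open(path) as outer:
        for a in outer:
            with open(path) as inner:  # fresh, independent file handle
                for b in inner:
                    compare_lines(a.rstrip("\n"), b.rstrip("\n"))
```

This is O(n²) in the number of lines, which matches the stated goal; if the file fits in memory after all, reading it once into a list is faster since it avoids re-reading the file n times.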



mysql - How do I see all foreign keys to a table or column?




In MySQL, how do I get a list of all foreign key constraints pointing to a particular table? a particular column? This is the same thing as this Oracle question, but for MySQL.


Answer



For a Table:



SELECT
    TABLE_NAME, COLUMN_NAME, CONSTRAINT_NAME, REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME
FROM
    INFORMATION_SCHEMA.KEY_COLUMN_USAGE
WHERE
    REFERENCED_TABLE_SCHEMA = '<database>' AND
    REFERENCED_TABLE_NAME = '<table>';

For a Column:



SELECT
    TABLE_NAME, COLUMN_NAME, CONSTRAINT_NAME, REFERENCED_TABLE_NAME, REFERENCED_COLUMN_NAME
FROM
    INFORMATION_SCHEMA.KEY_COLUMN_USAGE
WHERE
    REFERENCED_TABLE_SCHEMA = '<database>' AND
    REFERENCED_TABLE_NAME = '<table>' AND
    REFERENCED_COLUMN_NAME = '<column>';


Basically, we added a REFERENCED_COLUMN_NAME condition to the WHERE clause.


Why this behavior for javascript code?





Recently one of my friends asked me the output of following code





var length = 10;

function fn() {
    console.log(this.length);
}

var obj = {
    length: 5,
    method: function (fn) {
        fn();
        arguments[0]();
    }
};

obj.method(fn, 1);





I thought the answer would be 10 10, but surprisingly, for the second call, i.e. arguments[0](), the value comes out to be 2, which is the length of the arguments passed.
In other words, it seems arguments[0]() behaves like fn.call(arguments).




Why this behavior? Is there a link/resource for such a behavior?


Answer



The difference is because of the this context of each method call.



In the first instance, because the call is merely fn();, the this context is Window. The var length = 10; variable declaration at the top happens in the root/Window context, so window.length should be 10, hence the 10 in the console from the first function call.



Because arguments is not an array but is actually an Object of type Arguments, calling arguments[0]() means that the this context of the function call will be of the parent Object, so this.length is equivalent to arguments.length, hence the 2 (since there are 2 arguments). (See @Travis J's answer for a more thorough explanation of this part.)



If you were to add




this.fn = fn;
this.fn();


to the method() function, the result would be 5.
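The rule underlying all of these results is that this is bound at call time by the receiver, not by where the function is defined. A small sketch restating the answer's cases with Function.prototype.call (using a plain object shaped like an Arguments object, since that type cannot be constructed directly):

```javascript
// `this` is whatever object the function is invoked on.
function fn() {
  return this.length;
}

const obj = { length: 5 };
const args = { 0: fn, length: 2 }; // the shape of an Arguments object

console.log(fn.call(obj));  // 5 -- `this` is obj
console.log(fn.call(args)); // 2 -- `this` is the arguments-like object
console.log(args[0]());     // 2 -- equivalent to fn.call(args)
```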


mysql - What are some of the safest ways to connect to a database with PHP?

I'm still new to PHP and MySQL, and I'm trying to learn both with modern coding techniques. Everything I find online seems to be outdated.



Can anybody suggest anything for me? I am also curious if the below code is outdated? If it is indeed outdated, can you suggest newer and safer methods?




<?php
$connection = mysql_connect("localhost", "root", "");
if (!$connection) {
    die("Oops, error happened: " . mysql_error());
}
?>
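Yes, that snippet is outdated: the mysql_* extension was deprecated in PHP 5.5 and removed entirely in PHP 7. A hedged sketch of the modern equivalent using PDO (the host, database name, and credentials are placeholders):

```php
<?php
try {
    $pdo = new PDO(
        'mysql:host=localhost;dbname=your_database;charset=utf8mb4',
        'root',
        '',
        [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]
    );
} catch (PDOException $e) {
    die('Connection failed: ' . $e->getMessage());
}
```

PDO (or mysqli) also gives you prepared statements, which are the safe way to pass user input into queries.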

fread data.table in R doesn't read in column names




When reading in a data file into R, I can read it in either as a data.frame or a data.table using the data.table package. I would prefer to use data.table in the future since it deals with large data better. However, there are issues with both methods (read.table for data.frames, fread for data.tables) and I'm wondering if there's a simple fix out there.



When I use read.table to produce a data.frame, if my column names include colons or spaces, they are replaced by periods, which I do not want. I want the column names to be read in "as is."



Alternatively, when I use fread to produce a data.table, my column names are not read in at all, which is obviously not desired.



Check out this gist below for a reproducible example:



https://gist.github.com/jeffbruce/b966d41eedc2662bbd4a




Cheers


Answer



R always tries to convert column names into valid variable names, hence it substitutes periods for spaces and colons. If you don't want that, pass check.names = FALSE to read.table:



df1 <- read.table("data.txt", check.names = FALSE)

sample(colnames(df1), 10)
[1] "simple lobule white matter"
[2] "anterior lobule white matter"
[3] "hippocampus"

[4] "lateral olfactory tract"
[5] "lobules 1-2: lingula and central lobule (ventral)"
[6] "Medial parietal association cortex"
[7] "Primary somatosensory cortex: trunk region"
[8] "midbrain"
[9] "Secondary auditory cortex: ventral area"
[10] "Primary somatosensory cortex: forelimb region"


You can see that the column names are kept as-is.


