# Python's Summer of Code 2015 Updates

## November 20, 2015

### Ankit Kumar(SunPy)

#### Python Software Foundation Phase II : Coding, Coding, Coding<br>By Ankit Kumar (JUN 05, 2015)

Status Update:

Last two weeks:
• Implemented the URL pattern and data format for the STEREO SIT instrument - a few changes still to be made
• Implemented the data format for all the other instruments (committing to a PR soon)
• Many changes were requested to the code, a few of which have been accommodated while the rest are left for next week
• Struggled for a couple of days to read in headers along with data, but eventually gave up and read them separately
• Already made 3 PRs, about to make 3 more (1 for each instrument, on different branches)

A few places where I got intimidated a bit were when so many changes were requested on my PR, and in trying to read headers along with data. I also guess I lost a sense of direction a bit early, since I made a significant PR on the first day of coding itself, and there were ongoing changes in the SunPy platform which I hadn't expected to affect my development. But soon I figured out what part of my development would remain unaffected and coded all of that first, so as to buy some time before starting to develop on the changing parts of the platform.

Next two weeks:
• Implement URL patterns for all the instruments, using the scraper PR to download files based on the generated URLs
• Fit it finally into dataretriever.sources
• Complete the instrument classes implementation and dataretriever.sources

I hope to be more productive in the coming two weeks than I was in the last two.

Cheers
Ankit Kumar

-- Delivered by Feed43 service

#### Python Software Foundation Phase I : Getting Accepted !!, Community Bonding, Mailing List, Preparation, Welcome Package

So what's up people. Long time, huh!! It seems that getting done with semesters and getting accepted for GSOC 2015, and that too under such a prestigious organization as the Python Software Foundation, has put me into quite a celebratory mood. And that's why the heading is no longer Google Summer of Code 2015 Phase IV but Python Software Foundation Phase I :D

So I'll just list the things that have been going on with me over the last few weeks. Let's start with getting accepted. Well, I've already talked about it in the first paragraph, so not a lot about that, just how grateful I am to PSF and SunPy for giving me this opportunity. I really look forward to a successful GSOC 2015 completion. Moving on to the Community Bonding period. This has been a nice phase where I talked to my mentor; we decided upon work timings (due to different time zones), work methods (we are using Trello cards) and of course IRC. I've already gone over the few pieces of code that I need to understand to start, and in fact will be making my first commit tomorrow itself (with the beginning of the Coding Period). One thing that definitely characterized this period of Community Bonding was the annoying GSOC mailing list. OMG, are people seriously crazy !! :-( :-/ Like seriously, I had to change the mailing list settings to abridged daily updates because I was getting like 10 mails every day, and that too about some really trivial and irrelevant things. But yeah, whatever. So I guess that covers things up till the preparation part.

So let's move on to the welcome package sent by Google to all accepted students. I must say that over the past few weeks I was excited about this package, and it arrived just yesterday. For the record, it contains a Moleskine notebook, a pen-cum-pencil, and a GSOC sticker. It may also contain your payment card, but since I live in India and the only option we have is to opt for bank transfer, my package didn't have the payment card. For others, I am sure it would.

But now all that is done and it's time to get some perspective. By that I mean "LESS TALK, MORE CODE", and so, signing off, there is only one thing on my mind i.e.

Let The Coding Begin


#### Google Summer of Code 2015 Phase III : Github, Patch, PR and Proposal<br>By Ankit Kumar (MAR 21, 2015)

Now why does the heading specially mention GitHub (it's common among developers, right !!)? As it turns out, it was actually my first time using it, and hell, it was confusing, so I had to ask my friends and seniors, and I did trouble my mentor a lot, and I am really very sorry for that !! (David, if you read this, do know I am really sorry, this was my first time using GitHub. I promise that's the first thing I am going to sort out after I submit my proposal.)

So, more on that later. Right now I am just going to add some more commits improving my patch and then head straight into making the proposal, which btw I am completely freaked out about because it's such an important thing. So for now I just hope the proposal writing goes fine and I do get accepted. I'll update this post later when I get done with my proposal and am about to submit it, because then that'll officially be the end of Phase III of GSOC 2015. After that, Phase IV will start, that is, waiting for the results, but I think I am just gonna start reading up more code and at least set up the skeleton of the code (or make some progress with it) before I go back home for the summer. But let's focus right now on the proposal that's in front of us.

And to quote David

The requirements to fulfil are the following:

• Create a PR with a patch. This can be to any part of SunPy; the above would do. (By the way, it does not need to be accepted, but better if it is.)
• Create a blog with some tags/categories (python, psf, gsoc, ... you choose) so that what you write under it the PSF can grab automatically.
• Write your proposal. To write your proposal you should try to get familiar with everything, but mostly with the part that you are going to contribute to. So, if your project involves lightcurves, it would be good that you understand how they work and how we want them to work (https://github.com/sunpy/sunpy-SEP/pull/6) even if you are not going to do such a project. For that, it will be helpful if you know how sunpy.maps work too. The unifiedDownloader is going through deep changes, so keeping an eye on what they are is also good.


#### Google Summer of Code 2015 Phase II : Joining mailing Lists, Introducing myself, Talking to mentors, waiting for replies...

Well, I am obviously not gonna list down here the organizations and the projects that made it to my shortlist !! :P (for obvious reasons). I think I'll only mention the final one that I end up preparing the proposal for, but of course I don't know yet what that will be, so that's for later :P. Also, seeing that I have got a lot of work to get done, I am gonna keep this blog post short... save some talking for interacting with mentors !! Ha.

So, well, yeah, the first of all steps is to join the mailing lists of the development community of the respective organization and introduce yourself there, along with the specific idea that you have selected from that organization's pool of ideas. After that, it's a short wait for a reply, but the development community is really very helpful and welcoming and will help you get going with open source development even if you are a beginner at it (I was !!). I found really good organizations, and the mentors were really very patient in replying to my mails and answering all my questions pretty descriptively.

Well, after that comes reading up a bit more on the resources and links shared with you by the mentors, and getting a sense of the organization and especially how your selected idea might integrate with their overall mission and code base. In my case it took a bit of time with a few organizations, while with others it was much more rapid. Now, based on this newly gained knowledge, we have to decide whether we might be able to develop that idea, whether we are interested in it, and whether we ultimately get what the idea is. Ultimately, because you just have to get a gist of what it is, although a somewhat holistic gist, because the rest is for the time when we start preparing the proposal. (Note: It may seem that I simply had all this in my mind, but no, I had to talk to lots and lots of people: ex-GSOCers, some seniors at my college who were mentors for organizations, and mentors out there in the dev community.)

And Finally after all this well you get your final organization !! Right !!

Well, life ain't that straight. After doing all this, one random day (two days ago) I was just looking through the Python organizations, because I felt that if only there could be some more organizations with ideas a bit more interesting to me, and there I hit the PSF page and I am like "I definitely didn't see all these new organizations before". And so I sent out the mails and introductions, the whole process of Phase 2 was repeated, and guess what: the final organization that I ended up selecting is SunPy, under the Python Software Foundation. What I would especially like to mention here is the speed with which my mentor from SunPy helped me pick up the necessary bits and get started, since obviously I was a bit late. So now here I am, finally, with one single project, setting up the dev environment and using bits of it. And I guess now let's move on to Phase III of GSOC 2015. So let's get our hands dirty now and deal some blows with the SunPy codebase !!


#### Google Summer of Code Phase 1: Shortlisting of Organisations to numbers I can deal with.<br>By Ankit Kumar (Mar 13, 2015...

OK, so here is my first blog post for my Google Summer of Code 2015 proposal to SunPy, Python Software Foundation. So, hmm, how was my experience applying for GSOC? Hmm, let me think of which word is more intense than tiring, because woof, is it tough man !! So I started looking for organizations right after the list of accepted organizations was posted, and my god, there were 137 organizations in total, and that's kind of a lot !! So how do I filter down to an organization that's suited for me? You know, one important thing about me is that I love to learn and I have specific interests that I like to explore, so I needed an organization that suited my interests, or speaking specifically, one that I would be interested in continuing to work for even if they didn't pay me at all. See, that is how I choose whether or not I will do anything. That is where I get my persistence from. And this may sound crazy, because hey, you might not be interested in anything, but I am kinda unusual on that note. I am greatly interested in technology, business, entrepreneurship, astronomy, physics, and most importantly, programming.

So now, 137. Well, I know C, C++, Java, Python and web technologies, so how do I start? Let's rewind to when exactly I started loving programming, or when exactly it started speaking to me. I started out coding in my first year of college when we had a programming course in C. So it was nice, I got to know a very good programming language, C. And I aced that course too, not least because I liked it a lot... it felt singular, I mean, it was not as complex as talking to people, it was simple, and I liked that. Although a lot of the time I felt that it was restrictive, I mean, I couldn't do everything that I wanted to do with it. I guess it was because it probably couldn't all be covered in a single semester, or that the course was simply an introductory course, so they didn't want to complicate it enough that others couldn't follow. I wanted it to be able to talk to other files, read from them, write to them, and do this seamlessly without a lot of hassle, but as it turned out, it wasn't all that easy. It almost always remained in the console.

But come second year, and I am introduced to an online course on Python, and I delve more into it. And soon enough I learn how to make GUI applications in it, read files, write to them, plot graphs, make it talk to the internet, and that was liberating, and that is the story of how I fell in love for the second time. And it was similar to the adrenaline rush that I got when I fell for physics in ninth grade. So there it was, yeah, I felt liberated and powerful with Python because it enabled me. Another thing that I have been particularly inclined to is building things and then showing them off to people, that it worked !! Ha

So there was my decision: search for the Python tag. And now we were down to 40 organizations, and man, the real struggle starts now. So what I do is open up the ideas pages of all 40 organizations in side tabs and, over two days, read up on the projects, filtering through them. Even 40 is a lot, man. So I took up a simple criterion: I am just gonna select the projects, and therefore the organizations, where my current skills matched the requisite skills for that idea. I have a fair amount of experience using and developing in Python and its libraries (following from the fact that it made me feel liberated). This took a while. And I guess I ended up with some 20 organization ideas pages. That's nice, hah. So moving on, I cut through the list by selecting the organizations that also coded about things that interested me. And this was the most time-consuming process of all, because I had to read through each of the ideas, read it like, say, cover to cover, and then google about it, seeing some online examples of what the organization did and what it was used for. And after a hell of a time, I ended up with about 8 organizations, which for me was a decent number to start talking to mentors, hang out on IRC, introduce myself, and, you know, start looking at a specific idea from each ideas page. So basically, that meant 8 ideas selected down from 137 organizations times an average of 7-8 ideas per ideas page, i.e. 959-1096 ideas. Nice, huh !!

I had my spring break during this time so I was a bit merrier so I guess it took me a bit more time than it should have to get it done.

But whatever happens... I am moving on to the next phase, and that's all that mattered now !! So now let the talking begin !! It was finally time for Phase 2.


## November 09, 2015

### Vipul Sharma(MoinMoin)

#### Playing Pacman with gestures

scroll to the end of this post to see how this image was captured

Hello! Lately, I've been striking off some tasks from my long-pending TODO list. First, I finished off with Summrizer, and now this, which has been on my list for quite a long time!

After implementing a simple hand gesture recognizer using Python + OpenCV, I always wanted to do something more exciting and fascinating, like simulating different keyboard events based on gestures to achieve small tasks like opening files, folders, applications etc. But what's the fun in doing such boring tasks?

Therefore, I thought of playing the old school game of Pacman using gestures! No keyboard; only gestures in front of a webcam :D

For all the impatient folks, TL;DR here is the link to the code : https://github.com/vipul-sharma20/gesture-pacman

### Implementation of gesture mechanism

In layman's terms:
• Capture image frame containing any recognizable object
• Detect the object
• Check if/where the object moves
• Assign tasks (keyboard key press) as per different types of movement
The above algorithm seems quite easy to implement and yes, it's very easy :D Please read further for a more detailed explanation of each step.

### 1. Capture Frame

Capturing an image frame is the easiest task. We want to sense gestures, therefore we'll have to keep capturing frames continuously to record changes in the location of the object, hand, or anything recognizable that we can track and use as a means to input gestures.

Here is a test frame which I will be using to demonstrate all the processes involved:

You may notice that here I am holding a #9b465d colored (some people call it "pink") square paper. We can use this to input gestures by moving it in different directions in front of the webcam and then execute appropriate tasks based on its motion.

### 2. Detecting Object

#### Thresholding

In very basic terms, thresholding is like a low-pass filter: only a particular color range is allowed through and highlighted as white, while all other colors are suppressed and shown as black.

Before thresholding, the captured image is flipped (I've already flipped the above image) and converted from BGR to HSV.

BGR to HSV transformed image

Initially, I thought of thresholding using Otsu's Binarization method. In this method, OpenCV automatically calculates/approximates the threshold value of a bimodal image from its image histogram. But for optimal results, we may need a clear background in front of the webcam which is not possible in general. Also, what's the fun in that ;) So, I went with the traditional method of global thresholding by providing a range of min and max HSV values as a threshold range for the color pink. In this way, we will not be affected by the background unless it has something of the same color as the object in our hand.

Notice the difference in thresholding using Otsu's method and global method:

Thresholding using Otsu's Binarization method

You can notice here that there is a lot of white whereas we want only our object to be highlighted. We can obviously decide an ROI before thresholding, but that would be more of a restriction in the available region for moving the object.

Therefore, a global thresholding is more desirable.

Global Thresholding method

For better results, we can also try thresholding after performing Gaussian blurring on the original image. We blur the image for smoothing and to reduce noise and detail. We are not interested in the details of the image but in the shape/size of the object to track. In my implementation, I've NOT used this step as it is a little slow in terms of real-time processing, but you might like to see the effect of blurring on thresholding.

Original image after Gaussian Blur

Applying Global Thresholding on blurred frame

Here, we can see that thresholding after blurring leaves less noise and more discrete white regions than thresholding without blurring. Unfortunately, we'll have to forgo this optimal result as it is a little slow. But we can get the desired results after some tweaks even without this step. Just read further :D

### Contour Detection and Bounding Rectangle

Once the image is thresholded, we need to create a bounding rectangle so that we always have the exact coordinates of the object in our hand in real time. To achieve this, we first extract all the contours from the thresholded image and then select the contour with the maximum area. This max-area contour will be the object around which we create a bounding rectangle. More precisely, we can track the coordinates of the moving object in real time by tracking the centroid of the bounding rectangle.

Creating bounding rectangles around all the contours detected from the thresholded image

The good thing is, we have a bounding rectangle around the object we want to track, and the bad thing is clearly visible: rectangles appear around every other blob too. We can correct this by creating the bounding rectangle only around the contour with the maximum area. If we look at the thresholded image again, we can see that the largest white area is the pink square, and that's what we want to track. Therefore, by creating a rectangle around only the largest area we get the desired result.

The red mark inside the rectangle is the centroid of the bounding rectangle.

### 3. Check if/where object moves

For this, we can define our own quadrants on a frame and locate the position of the centroid of the bounding rectangle in those quadrants. Based on the quadrant in which the point lies, we can trigger an appropriate keyboard event.

Here, I've created 4 rectangular divisions for triggering 4 different movements: up, down, left, right. Looking closely, we can see that the centroid lies in the upper division; hence, we can simulate an "Up" key press event, and similarly we can trigger left, down and right key press events based on the location of the centroid among the divisions.
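
The division logic can be sketched in plain Python (the nearest-edge rule below is my own illustration of the idea; the original post draws four rectangular divisions on the frame):

```python
# Map the centroid of the tracked object to an arrow-key name,
# for a frame of the given width and height.
def centroid_to_key(cx, cy, width, height):
    """The edge the centroid is closest to decides the simulated key."""
    candidates = {
        "left": cx,            # distance from the left edge
        "right": width - cx,   # distance from the right edge
        "up": cy,              # distance from the top edge
        "down": height - cy,   # distance from the bottom edge
    }
    return min(candidates, key=candidates.get)

print(centroid_to_key(80, 10, 160, 120))   # near the top edge -> "up"
```

A matching call such as pyautogui.press(centroid_to_key(cx, cy, w, h)) would then fire the actual key press described below.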

For simulating keyboard key press events, I've used the pyautogui library.

Here is the link to the code :

The big question: Where is Pacman ??

Now that we have created the script to input gestures and trigger keyboard events, we can try it out by playing Pacman :D

Below is the video of me playing Pacman with gestures. This is not exactly the same old classic Pacman which had the kill screen bug, but it's good enough to demonstrate the working :)

In case you were wondering how the header image was captured...

## October 27, 2015

### Vipul Sharma(MoinMoin)

#### Summrizer: Text summarizer (Implementation) & Context Extractor

Recently, I've been working on a text summarization script in Python (previous blog post). I've built a naive implementation of a text summarizer and also a custom Text Context Analyzer, which is basically a self-customized Part Of Speech and Noun Phrase tagger that determines what the content is about, i.e. the important context of the text.

For all the impatient folks, TL;DR here is the link to the code : https://github.com/vipul-sharma20/summrizer

NOTE: Works only for English language :)

### Implementing Summarizing Script

This summary script works well for news articles and blog posts, and that's the basic motive for implementing it. It takes the text content, splits it into paragraphs, splits those into sentences, filters out stopwords, calculates a score (relevance) for each sentence, and on the basis of the scores assigned to each sentence displays the most relevant ones, depending upon how concise we want our summary to be.

Splitting the content into paragraphs and then into sentences is easier than the rest of the tasks, so it can be skipped here. Before implementing the scoring algorithm, I filtered out the stopwords. Stopwords are the most commonly used words in any language. For example, in English we have words like this, that, he, she, I, etc. These are among the most frequently used words in the English language and may not have significance in deciding the importance of a sentence. Therefore, these stopwords are removed from the text so that the scoring algorithm does not score a sentence based on irrelevant words.
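
A minimal sketch of this stopword-filtering step (the tiny stopword set below is illustrative only; a real implementation would use a fuller list such as NLTK's stopwords corpus):

```python
# A tiny hand-rolled stopword set, just for illustration.
STOPWORDS = {"the", "a", "an", "this", "that", "he", "she", "i",
             "is", "are", "was", "to", "of", "and", "in", "it"}

def content_words(sentence):
    """Lowercase, split, and drop stopwords and bare punctuation."""
    tokens = (w.strip(".,!?;:\"'") for w in sentence.lower().split())
    return [w for w in tokens if w and w not in STOPWORDS]

print(content_words("The BBC has been testing a new service called SoundIndex"))
```

Only the remaining content words take part in the intersection scoring described next.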

### Scoring

The scoreSentence() function receives two sentences, finds the intersection between the two, i.e. the words/tokens common to both sentences, and then normalizes the result by the average length of the two sentences.

avg = (len(s1) + len(s2)) / 2.0
score = len(s1.intersection(s2)) / avg

The most important and interesting part is: how to make use of this scoring algorithm? Here, I've created an all-pair score graph of sentences, i.e. a completely connected, weighted graph which contains the scores between all pairs of sentences in a paragraph. The function sentenceGraph() performs this task.

Suppose scoreGraph is the obtained weighted graph. Then scoreGraph[0][5] will contain the score between sentence no. 1 and sentence no. 6. Similarly, there will be a separate intersection score for every pair. Therefore, if there are 6 sentences in a paragraph, we will have a 6x6 matrix as a score graph.

The scoreGraph consists of pairwise scores. So, to calculate the individual score of each sentence, we sum up all the intersection scores of that sentence with the other sentences in the paragraph and store the result in a dictionary, with the sentence as the key and the calculated score as the value. The function build() performs this task.
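
The scoring and per-sentence accumulation described above can be sketched as follows (the function names are my shorthand for the post's scoreSentence(), sentenceGraph() and build(); stopword filtering is omitted for brevity):

```python
def score_sentence(s1, s2):
    """Intersection of word sets, normalised by average sentence length."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    avg = (len(w1) + len(w2)) / 2.0
    return len(w1 & w2) / avg if avg else 0.0

def build_scores(sentences):
    """Sum each sentence's pairwise scores against every other sentence."""
    return {
        s: sum(score_sentence(s, t) for j, t in enumerate(sentences) if j != i)
        for i, s in enumerate(sentences)
    }

para = ["the cat sat on the mat",
        "the cat ate the fish",
        "stock markets fell sharply today"]
scores = build_scores(para)
best = max(scores, key=scores.get)   # the most connected sentence wins
```

The off-topic third sentence shares no words with the others, so it scores zero and would be dropped first when building the summary.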

### Summary

To build the summary from the final score dictionary, we pick as many of the top-scoring sentences as we need, depending upon the conciseness of the summary required.

Complete code of summarizing script :

### Example

I've tested the scoring algorithm on a paragraph of an article from techcrunch:

"The BBC has been testing a new service called SoundIndex, which lists the top 1,000 artists based on discussions crawled from Bebo, Last.fm, Google Groups, iTunes, MySpace and YouTube. The top five bands according to SoundIndex right now are Coldplay, Rihanna, The Ting Tings, Duffy and Mariah Carey , but the index is refreshed every six hours. SoundIndex also lets users sort by popular tracks, search by artist, or create customized charts based on music preferences or filters by age range, sex or location. Results can also be limited to just one data source (such as Last.fm)."

### Result

The BBC has been testing a new service called SoundIndex, which lists the top 1,000 artists based on discussions crawled from Bebo, Last.fm, Google Groups, iTunes, MySpace and YouTube : 0.338329361595

The top five bands according to SoundIndex right now are Coldplay, Rihanna, The Ting Tings, Duffy and Mariah Carey , but the index is refreshed every six hours. : 0.286057692308

SoundIndex also lets users sort by popular tracks, search by artist, or create customized charts based on music preferences or filters by age range, sex or location. : 0.285784751456

Results can also be limited to just one data source (such as Last.fm). : 0.237041838857

As per the context of the news, it is evident that the first two sentences are the most relevant part of the paragraph and hence have higher score than the rest of the sentences.

Speaking of Coldplay, I highly recommend : :D

### Comparison

I've tried various text compacting and text summarizing websites, using the above paragraph to test their performance, and here are the results:

• summarized it to "Sound Index also lets users sort by popular tracks, search by artist, or create customized charts based on music preferences or filters by age range, sex or location."
• http://freesummarizer.com summarized it to "SoundIndex also lets users sort by popular tracks, search by artist, or create customized charts based on music preferences or filters by age range, sex or location."
• http://smmry.com did nothing but convert the paragraph into sentences and display them :|
• summarized it to the following when I set 50 % for summary limit: "The BBC has been testing a new service called SoundIndex, which lists the top 1,000 artists based on discussions crawled from Bebo, Last.fm, Google Groups, iTunes, MySpace and YouTube. The top five bands according to SoundIndex right now are Coldplay, Rihanna, The Ting Tings, Duffy and Mariah Carey , but the index is refreshed every six hours."
http://textcompactor.com produced the same result as my script when used with a 50% compaction limit :D The others were pretty disappointing.

Try copy-pasting the paragraph used in the example to verify the results.

### Context Extractor

The summarizing script, as explained above, works on top of a scoring algorithm. One might instead need to extract only the context, or the main topics, from a sentence so as to know what the text content is about. This provides a very abstract idea of the content we might be dealing with.

The phrase structure of a sentence in English is of the form:
$S \to NP \quad VP$
The above rule means that a sentence (S) consists of a Noun Phrase (NP) and a Verb Phrase (VP). We can further define grammar for a Noun Phrase but let's not get into that :)

A Verb Phrase describes the action performed on or by the subject, whereas a Noun Phrase functions as the subject or object of a verb in a sentence. Therefore, NPs can be used to extract the important topics from the sentences.

I've used Brown Corpus in Natural Language Toolkit (NLTK) for Part Of Speech (POS) tagging of the sentences and defined custom Context Free Grammar (CFG) for extracting NP.

"The Brown Corpus was the first million-word electronic corpus of English, created in 1961 at Brown University. This corpus contains text from 500 sources, and the sources have been categorized by genre, such as news, editorial, and so on."

See more at: NLTK-Brown Corpus

A part-of-speech tagger, or POS-tagger, processes a sequence of words, and attaches a part of speech tag to each word:

 >>> text = word_tokenize("And now for something completely different")
 >>> nltk.pos_tag(text)
 [('And', 'CC'), ('now', 'RB'), ('for', 'IN'), ('something', 'NN'),
 ('completely', 'RB'), ('different', 'JJ')]
See more at: NLTK-Using a Tagger

In my context extractor script, I've used unigram as well as bigram POS tagging. A unigram tagger is based on a simple statistical algorithm: for every token/word, assign the tag that is most likely for that token, as decided by a lookup in the trained data. The drawback of unigram tagging is that we tag each token with its "most likely" tag in isolation from the larger context of the text.

Therefore, for better results we use an n-gram tagger, whose context is the current token along with the POS tags of the preceding n-1 tokens. The problem with n-gram taggers is the sparse data problem, which is quite inherent in NLP.

"As n gets larger, the specificity of the contexts increases, as does the chance that the data we wish to tag contains contexts that were not present in the training data."

Even for n=2, i.e. in the case of a bigram tagger, we can face this sparse data problem. Therefore, to avoid it, I've initially used a Bigram Tagger; if it fails to tag some tokens, it backs off to a Unigram Tagger, and if even the Unigram Tagger fails to tag the tokens, it backs off to a RegEx Tagger which has some naive rules for tagging nouns, adjectives, cardinal numbers, determiners etc.
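
A toy illustration of this backoff chain (the lookup tables below are made up; real code would use NLTK's BigramTagger, UnigramTagger and RegexpTagger chained via their backoff parameter):

```python
import re

# Toy lookup tables standing in for taggers trained on the Brown corpus.
UNIGRAM = {"the": "DT", "dog": "NN", "barks": "VBZ", "loudly": "RB"}
BIGRAM = {("the", "dog"): "NN"}            # tag given (previous word, word)
REGEX_RULES = [(r".*ing$", "VBG"),         # gerunds
               (r"^-?\d+$", "CD"),         # cardinal numbers
               (r".*", "NN")]              # default: noun

def tag(words):
    """Try the bigram lookup first, back off to unigram, then to regex rules."""
    tags, prev = [], None
    for w in words:
        t = BIGRAM.get((prev, w)) or UNIGRAM.get(w)
        if t is None:                      # final backoff: naive regex rules
            t = next(tag_ for pat, tag_ in REGEX_RULES if re.match(pat, w))
        tags.append((w, t))
        prev = w
    return tags

print(tag(["the", "dog", "barks", "loudly"]))
```

Unknown words like "running" fall all the way through to the regex rules, which is exactly the role the RegEx Tagger plays in the post's chain.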

I've also defined a custom CFG (Context Free Grammar) to extract Noun Phrases from the POS tagged list of tokens.
(I can discuss how the custom CFG works if someone is interested :) ! )

Here is the code which performs this task : context.py

### Example

I've used the same content as used in the summarizer script as the test example for context extracting script:

"The BBC has been testing a new service called SoundIndex, which lists the top 1,000 artists based on discussions crawled from Bebo, Last.fm, Google Groups, iTunes, MySpace and YouTube. The top five bands according to SoundIndex right now are Coldplay, Rihanna, The Ting Tings, Duffy and Mariah Carey , but the index is refreshed every six hours. SoundIndex also lets users sort by popular tracks, search by artist, or create customized charts based on music preferences or filters by age range, sex or location. Results can also be limited to just one data source (such as Last.fm)."

#### Result

['BBC', 'new service', 'SoundIndex', 'Bebo', 'Last.fm', 'Google Groups', 'MySpace', 'YouTube', 'SoundIndex', 'Coldplay', 'Rihanna', 'Ting Tings', 'Duffy', 'Mariah Carey', 'SoundIndex', 'lets users sort', 'popular tracks', 'music preferences', 'age range', 'data source', 'Last.fm']

This is the list of topics discussed in the test paragraph, and it looks good :D NLP can never yield 100% accurate results; all we can do is train on a data set, so in this case too, some undesired results may arise.

Please suggest some improvements :) I would love to hear your views :D

## October 22, 2015

### Aman Singh(Scikit-image)

#### Efficient and Easy to code Graph Search Algorithms using STL:

Following is an easy implementation of BFS using the STL in C++:

#include <bits/stdc++.h>
using namespace std;

typedef vector<int> vi;
typedef vector<vi> vvi;
typedef pair<int,int> ii;
#define pb push_back
#define tr(c,i) for(typeof((c).begin()) i = (c).begin(); i != (c).end(); i++)
#define all(c) (c).begin(),(c).end()

vvi graph;
// start_vertex is the starting vertex.
// n is the number of nodes.
// Returns 1 if every node was reached from start_vertex, 0 otherwise.
int bfs(int start_vertex, int n){
    vi visited(n, 0);
    queue<int> q;
    visited[start_vertex] = 1;
    q.push(start_vertex);
    while(!q.empty()){
        int idx = q.front();
        q.pop();
        // Visit every unseen neighbour of idx.
        tr(graph[idx], itr){
            if(!visited[*itr]) {
                q.push(*itr);
                visited[*itr] = 1;
            }
        }
    }
    // All nodes visited <=> no 0 left in visited.
    return (find(all(visited), 0) == visited.end());
}
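
For comparison, the same reachability check is a few lines in Python with collections.deque (a sketch, not part of the original post):

```python
from collections import deque

def bfs_connected(graph, start):
    """Return True if every node of the adjacency-list graph is reachable."""
    visited = [False] * len(graph)
    visited[start] = True
    q = deque([start])
    while q:
        node = q.popleft()
        for nbr in graph[node]:
            if not visited[nbr]:
                visited[nbr] = True
                q.append(nbr)
    return all(visited)

print(bfs_connected([[1], [0, 2], [1]], 0))   # path graph 0-1-2: True
print(bfs_connected([[1], [0], []], 0))       # node 2 unreachable: False
```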


## October 21, 2015

### Aman Singh(Scikit-image)

#### Different versions of Binary Search:

Many a time we face the need to modify binary search to solve different competitive coding problems. I am archiving some of the most used variations of binary search:

• Normal Binary search with two comparisons:

int BinarySearch(int A[], int l, int r, int key)
{
    int m;
    while( l <= r )
    {
        m = l + (r-l)/2;

        if( A[m] == key ) // first comparison
            return m;

        if( A[m] < key )  // second comparison
            l = m + 1;
        else
            r = m - 1;
    }
    return -1;
}
• Binary search with fewer comparisons:

// Input: A[l .... r-1]
// Note: A[r] is not being searched
int BinarySearch(int A[], int l, int r, int key)
{
    int m;
    while( r - l > 1 )
    {
        m = l + (r-l)/2;

        if( A[m] <= key )
            l = m;
        else
            r = m;
    }

    if( A[l] == key )
        return l;
    else
        return -1;
}

• Binary search used to find the floor value:
Note: if there are duplicates, this returns the value at the last occurrence of key.

//  E.g. the code returns 3 when searching for the floor of 4
//  in the array [1,2,3,5,6]
//  Input: A[l .... r-1]
//  Note: A[r] is not being searched
int Floor(int A[], int l, int r, int key)
{
    int m;
    while( r - l > 1 )
    {
        m = l + (r - l)/2;
        if( A[m] <= key )
            l = m;
        else
            r = m;
    }
    return A[l];
}

// Initial call
int Floor(int A[], int size, int key)
{
    // Error check: no floor exists if key < A[0]
    if( key < A[0] )
        return -1;
    // Observe the boundaries
    return Floor(A, 0, size, key);
}

• Finding the number of occurrences of a number in a sorted array:

// Input: index range [l ... r)
// Invariant: A[l] <= key and A[r] > key
int GetRightPosition(int A[], int l, int r, int key)
{
    int m;

    while( r - l > 1 )
    {
        m = l + (r - l)/2;

        if( A[m] <= key )
            l = m;
        else
            r = m;
    }

    return l;
}

// Input: index range (l ... r]
// Invariant: A[r] >= key and A[l] < key
int GetLeftPosition(int A[], int l, int r, int key)
{
    int m;

    while( r - l > 1 )
    {
        m = l + (r - l)/2;

        if( A[m] >= key )
            r = m;
        else
            l = m;
    }

    return r;
}

int CountOccurrences(int A[], int size, int key)
{
    // Observe the boundary conditions
    int left = GetLeftPosition(A, -1, size-1, key);
    int right = GetRightPosition(A, 0, size, key);

    // What if the element doesn't exist in the array?
    // These checks verify that the element is actually present.
    return (A[left] == key && key == A[right]) ?
        (right - left + 1) : 0;
}
• Apart from these hand-written functions, the STL also provides lower_bound and upper_bound.
lower_bound returns an iterator to the first element not less than key (i.e. the first occurrence if key is present), while upper_bound returns an iterator to the first element greater than key. The example makes this clearer.

#include <iostream>     // std::cout
#include <algorithm>    // std::lower_bound, std::upper_bound, std::sort
#include <vector>       // std::vector

int main () {
    int myints[] = {10,20,30,30,20,10,10,20};
    std::vector<int> v(myints, myints+8);            // 10 20 30 30 20 10 10 20

    std::sort (v.begin(), v.end());                  // 10 10 10 20 20 20 30 30

    std::vector<int>::iterator low, up;
    low = std::lower_bound (v.begin(), v.end(), 20); // points at the first 20 (index 3)
    up  = std::upper_bound (v.begin(), v.end(), 20); // points past the last 20 (index 6)

    std::cout << "lower_bound at position " << (low - v.begin()) << '\n';
    std::cout << "upper_bound at position " << (up - v.begin()) << '\n';

    return 0;
}
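For Python users, the standard-library bisect module covers the same ground as lower_bound/upper_bound and the hand-rolled variants above; a sketch on the same data as the C++ example:

```python
from bisect import bisect_left, bisect_right

v = sorted([10, 20, 30, 30, 20, 10, 10, 20])  # [10, 10, 10, 20, 20, 20, 30, 30]

low = bisect_left(v, 20)    # like lower_bound: index of the first 20
up = bisect_right(v, 20)    # like upper_bound: index just past the last 20

count = up - low                           # occurrences of 20 in v
floor_of_25 = v[bisect_right(v, 25) - 1]   # largest element <= 25
```

As with the C++ versions, the floor lookup assumes the key is not smaller than every element; add a bounds check before indexing otherwise.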

### The excitement

People travelling from all over the country (and outside!) to Bangalore for a conference on a weekend, yay!
We were really excited about the workshop and devsprint that the SymPy team was about to deliver, and even more excited that we would finally be meeting one another.

### Day 0

#### DevSprint

The first day of the conference kicked off with the devsprints. That morning the whole team met up; present were Harsh, Sudhanshu, AMiT, Sartaj, Shivam and Sumith. Abinash couldn't make it, but he was there in spirit :)
We all got our awesome SymPy tees and stickers, thanks to AMiT.
Having been allotted a mentoring space in the devsprint, Sumith gave a basic introduction to SymPy. Some other interesting mentoring spaces were CPython by Kushal Das and Data Science by Bargava. The whole list is here
We got the participants started with setting up the SymPy development workflow, and then they started working on the internals. We allotted bugs to many of them and directed them towards solutions. Sadly, not many issues could be allotted or closed due to the really poor internet connection at the conference hall, but it was cool interacting with the enthusiasts. We also happened to meet Saurabh Jha, a contributor to SymPy who had worked on linear algebra, and he helped us out with the devsprint.

#### Workshop

The workshop ran in a two-and-a-half-hour slot, conducted by Harsh, Sudhanshu, AMiT and Sumith.
Sumith started off with an introduction to SymPy. Then we spent some time helping everyone set up their systems with SymPy and IPython notebooks; even though prior instructions had been given, we had to do this to get everyone on level ground.

Harsh took the first half of the content and exercises, and Sudhanshu took the second half, while AMiT and Sumith helped the participants with their queries.

We distributed t-shirts to all the participants at the end. Thanks to all those who participated, we had an awesome time.

Day 0 ended with all of us wrapping up the devsprint.
After having dinner together, everybody headed back looking forward to the coming two days of the conference.

### Day 1

Day 1 started off with a keynote by Dr Ajith Kumar B.P., followed by multiple talks and lightning talks.
More interesting than the scheduled talks were the conversations we had with people at the conference. Exchanging views and discussing common points of interest was surely one of the best experiences I had.

#### Lightning talk

Shivam delivered a lightning talk titled Python can be fast. He stressed that choosing the correct data structures matters and that Python is not always to blame, giving relevant examples from his summer's work on SymPy.

By this point, we had reached a considerable audience at the conference, and a lot of them were really interested in SymPy. There were many younger participants who were enthusiastic about SymPy since it participates in GSoC; some of them even sent in patches.

### Day 2

Day 2 started off with a keynote by Nicholas H. Tollervey.

#### Talk

Sumith delivered a talk titled SymEngine: The future fast core of computer algebra systems. The content covered SymPy, SymEngine and the interface, and some light was shed on Python wrappers for C++ code. Thanks to all the audience present there.

As the day was closing in, Harsh and Shivam had to leave to catch their flights.

#### Open Space

After multiple people requested help getting started with SymPy, we decided to conduct an open space.
Open spaces are a way for people to come together to talk about topics, ideas or whatever they want. All people had to do was just show up :) Present were Sudhanshu, Sartaj, AMiT and Sumith. Sartaj luckily came up with a solveset bug, so we had a live demonstration of how bug-fixing is done: filing an issue, fixing the code, writing tests and sending in a PR.

### Closing thoughts

Conferences are the perfect place to discuss and share knowledge and ideas. The people present were experts in their areas of interest, and conversations with them are a great experience. Meeting the team was something we had been looking forward to right from the start.

Missing Sartaj and Abinash

Discussing SymPy and the gossip in person is a different experience altogether. I'll make sure to attend every conference I possibly can from here on.

Be back for more

## October 19, 2015

### Vipul Sharma(MoinMoin)

#### Summrizer: Text summarizer (Introduction)

EDIT: Completed the project :) see here

I had started working on this project 6-7 months ago. I left it midway as I got busy with something else, but now I am back onto it :D. The plan was to create something insanely awesome, but then I recalled a few words someone once told me: first create a Minimum Viable Product, then go for more features.

Currently, I am working on a pretty naive text summarizer, implementing a basic text-scoring algorithm with some use of NLTK. Although I've worked with Stanford's CoreNLP earlier, I wanted to explore the power of NLTK.
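To make "basic text scoring" concrete, here is a toy version of the idea (an illustration only, not the project's actual code): rank sentences by the average frequency of the words they contain and keep the top scorers.

```python
import re
from collections import Counter

def summarize(text, n=2):
    """Very naive extractive summary: rank sentences by word-frequency score."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'\w+', text.lower()))

    def score(sentence):
        # Average corpus frequency of the sentence's words.
        tokens = re.findall(r'\w+', sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    ranked = sorted(sentences, key=score, reverse=True)[:n]
    # Preserve the original order of the selected sentences.
    return [s for s in sentences if s in ranked]
```

Real summarizers add stop-word removal, stemming and position weighting on top of this; that is where NLTK comes in.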

I've tested the script by summarizing some articles from techcrunch.com and compared the summaries with the results from some online text-summarizing websites like:

Once this basic implementation works well, I'll try to implement some more complex language-processing concepts, for which I may use more of NLTK or even CoreNLP (personally, I like Stanford's CoreNLP more).

Code: https://github.com/vipul-sharma20/summrizer

I've also created a separate branch and initialized it with a Django-based web application. Once the script works well, I'll try to host it as a web application for text summarization. But my priority and focus is on making the summarizing script more efficient :D

## October 07, 2015

### Himanshu Mishra(NetworkX)

#### GSoC '15 Progress: Second Report

The past couple of weeks have been fun! I learnt many new and interesting things about Python.

The modified source code of METIS has got in, followed by its Cython wrappers. Thanks again to Yingchong Situ for all his legacy work. Nevertheless, things were not smooth and there were lots of hiccups and things to learn.

One of the modules in the package was named types and was being imported with an absolute import. Unaware that types is also a built-in Python module, I found the situation a mystery. Thanks to IPython, which told me this:
In [2]: types
Out[2]: <module 'types' from '/usr/lib/python2.7/types.pyc'>

This alerted me to the differences, pros and cons of absolute and relative imports. Now one may ask (does anyone read these blog posts?) why I didn't go with the following in the first place:

In [3]: from . import types

Actually, networkx-metis is supposed to be installed as a namespace package inside networkx, and the presence of __init__.py is prohibited in a namespace package. Hence from . import types would raise a ValueError: Attempted relative import in non-package.

We are now following Google's style guide for Python [1].

Being licensed under the Apache License, Version 2.0, we also had to add a NOTICE file clearly stating the modifications we made to the library that networkx-metis is a derivative work of.

Next important items on my TODO list are:
• Finalizing everything for namespace packaging
• Setting up Travis CI
• Hosting docs on readthedocs.org
That's all for now.

Happy Coding!

[1] https://google-styleguide.googlecode.com/svn/trunk/pyguide.html

## September 11, 2015

### Vipul Sharma(MoinMoin)

#### Writing training mission for Openhatch

I am grateful to this wonderful community who introduced me to the world of Open Source and I owe them a lot for my success in GSoC, 2015. I learned a lot of new things by interacting with this awesome community.

In the beginning of my journey into open source, I learned a few concepts through the training missions of OpenHatch (http://openhatch.org/missions/), and now I want to share what I have learned in the past 3 months by writing a new training mission for OpenHatch. I wrote a training mission on "Using the shell" earlier in February this year (http://openhatch.org/missions/shell/about), and now I am looking forward to writing a training mission on using the version control system Mercurial (hg). I am myself a newbie to Mercurial and have learned just a little during my GSoC project this year, but I'll still try to write a good mission by discussing it with the community, and I'll also take some inspiration from the existing missions on SVN and Git, which are quite exciting to follow.

I would love to hear any cool ideas/suggestions for the Mercurial training mission: how to make it more interactive, keep the user involved, and check their progress with objectives or small tasks that test whether they are learning correctly.

It has been quite some time since I've developed on Django; now I am very excited to write some good stuff and contribute to OpenHatch :D I will update my progress through blog posts.

I almost forgot to thank Sufjan! It always helps me write code ;) Say hi!

## September 09, 2015

### Siddharth Bhat(VisPy)

#### Freelancing - What have I gotten myself into?

I’ve been studying for my exams pretty heavily over the past few days. So, I decided to take a break to do something completely different - I’ve wanted to freelance for a while, since it seemed like a really nice way to learn new things (and solve real-world problems), while sipping coffee at home in your pyjamas.

With this sufficiently rose-tinted picture in mind, I began scouting for websites. After all, how hard can it be to create a profile and get started? Turns out, the corner of the web that deals with all things "work" is stuck in the 20th century, design and all.

What I found stunning was the honestly terrible interface of most of these websites. Freelancer seems to be a very highly rated website, but has more micro-transactions than a typical EA game.

Upwork seemed promising, until their website borked at a lot of interactions and gave me a nice (error: 0) that I’m still not sure how to interpret.

Both of these left me with a sour taste in my mouth, along with a couple of others I won’t bother mentioning here.

I was disappointed by the fact that none of the websites seemed to be a “by programmers, for programmers” sort of thing. Programming was something that was lumped in with design, creative writing, testing, what-have-you. This felt like a disservice to all of those fields.

I don’t think that I’m that hard to please. All I wanted was:

• A clean, functional UI
• Feel like it’s programmers first
• Ask for programming experience, maybe a field to fill up open source contributions

I had half-resigned myself to the fact that what I was looking for was simply a pipe-dream. You really start to question yourself when you hit Google’s next page button, and still don’t find “the one” you’re looking for.

## The saving grace - Toptal

A lucky stumble upon a Quora link about freelancing pointed me to Toptal.

Holy crap do they know what they were doing.

The website is clean and functional. They have an honest-to-god interview process. They took a resume, fields of interest, programming experience, GitHub username, as well as sample code (I'll be honest, this pushed me from lukewarm into in-love territory).

I filled up their application form. The next step is the problem-solving and interview rounds, which I'm totally looking forward to. This seems to be a tech startup that knows what it's doing, for both customers and developers.

All in all, I’m excited about the opportunity to work on challenging and interesting projects, thanks to Toptal. It looks like they’re succeeding where most others have failed - make freelancing for developers a painless experience.

They seem promising from what I've seen of them so far, and I hope the experience continues to be great. If this works out, it'll make me a happy person.

I do believe they need to look into their search engine optimisation though. They didn’t come up when I made a generic “freelance programming” search on Google. Of course, my location (India) could have something to do with that. I’d still look into it if I were them.

#### Full Disclosure

Write a blog post to get priority access.

That was part of the reason why I wrote this. The other part was legitimately wanting to whine about the sorry state of affairs that I encountered with the freelancing scene in general.

While my blog post might seem fanboy-ish, it’s purely honest opinion. I’ve liked what I’ve seen so far from Toptal, So I decided I might as well chronicle my foray into the scene. After all, I might look back at this blog post 5 years later and laugh. I know I certainly do so with my diaries. Only time will tell!

Keep the fingers crossed for me!

### Goran Cetusic(GNS3)

#### Docker in network emulators

Recently, GNS3, a popular network emulator, has been developing support for Docker as one of its endpoint devices. What it calls endpoint devices are actually VirtualBox and VMware VMs, and now Docker containers. It also supports other types of virtual nodes, like switches and routers that are actually Cisco IOS images. This is in contrast to IMUNES, which uses Docker to emulate both the equivalents of GNS3 endpoints (PC and Host nodes) and switches and routers, by configuring a Docker container based on the type of device.

Regardless of the type of device you're trying to emulate, using a generic virtualization technology in a network emulator lets users choose which software is available on their network nodes. True, you can't run Cisco IOS on those nodes, but you can run software like Nginx, Gunicorn, Apache, MongoDB and PostgreSQL in a controlled networked environment. Using Docker as the underlying technology makes sense because it uses host resources efficiently, which lets a network emulator create a hundred network nodes in under a minute while still providing the flexibility to set up your own brand of virtual node. This is different from the full virtualization that VMware and VirtualBox provide and similar to what LXC does. I'm not going into details on how full and kernel-level virtualization work or the difference between LXC and Docker. This post is about Docker. So first, a little introduction to Docker and how it works.

### Linux namespaces

Docker is a lightweight virtualization technology that uses Linux namespaces to isolate resources from one another. Linux provides the following namespaces:

| Namespace | Isolates |
|-----------|----------|
| IPC | System V IPC, POSIX message queues |
| Network | Network devices, stacks, ports, etc. |
| Mount | Mount points |
| PID | Process IDs |
| User | User and group IDs |
| UTS | Hostname and NIS domain name |

Why several namespaces? Because this way Linux can finely tune which processes can access which resources. For example, if you run two processes like Apache and PostgreSQL and they have different network namespaces, they won't see the same interfaces. However, since no one told them to use their own mount namespace, they still see the same mount points. This can be useful, you generally want all processes to see the same root disk but not other resources. Tools like Docker and LXC do a pretty good job putting it all together so you get what looks like lightweight virtual machines. Because it's done inside the same kernel it's lightning fast. However, this limits us to using the same kernel. With Linux namespaces you can only create Linux VMs while with VirtualBox you get full virtualization. Using Docker is not that much of a restriction in network emulators because it's designed for Linux which provides a wide range of network software. Also, with boot2docker (and Docker Machine) for Mac OS X and the recent port to FreeBSD, the list of OS restrictions becomes smaller and smaller.
Namespaces can get a bit technical, and if you're interested in how they work, here are a few articles to help you get started:

### Docker images

The collections of resources and namespaces that Docker puts together, and that act like VMs, are called containers. That's why it's called Docker, I guess. Anyway, to start multiple containers Docker uses templates called images. These templates generally don't include binaries directly but provide instructions: which software packages to download, which scripts to run, what to configure, and so on. The instructions are listed inside what is called a Dockerfile, from which the image is generated. Of course, you can do pretty much whatever you want inside the Dockerfile, so if you have scripts or binaries you want to include in the image, you can ship them.
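As an illustration, a minimal Dockerfile might look like this (a hypothetical sketch; the base image and package list are only examples, not the contents of any real image mentioned below):

```dockerfile
# Hypothetical example: a small image with a few network tools.
FROM debian:jessie

# Install some networking utilities; the package list is illustrative.
RUN apt-get update && \
    apt-get install -y iproute2 iputils-ping net-tools && \
    rm -rf /var/lib/apt/lists/*

# Drop into a shell when the container starts.
CMD ["bash"]
```

Running `docker build` against a file like this produces an image that can then be started as many containers as you like.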

Much like repositories on GitHub, Docker has something called Docker Hub where various images are stored. Organizations like PostgreSQL, Redis, Fedora and Debian all have their own images hosted there, which makes it easy to quickly start a Docker container with their software installed. There's a whole bunch of sites offering documentation on how to install Docker, create Docker images, push them to Hub and start containers from those images, but here's a short recipe for Fedora 22:

[cetko@nerevar ~]$ sudo dnf install docker
[cetko@nerevar ~]$ sudo systemctl start docker
[cetko@nerevar ~]$ docker pull debian
latest: Pulling from debian
902b87aaaec9: Already exists
9a61b6b1315e: Already exists
Digest: sha256:b42e664a14b4ed96f7891103a6d0002a6ae2f09f5380c3171566bc5b6446a8ce
Status: Downloaded newer image for debian:latest
[cetko@nerevar ~]$ docker run -it debian bash
root@9695859bac69:/#

And that's it! With the last command you've run the Bash shell inside a containerized Debian Jessie installation. You're all set up to use any of the multitude of Docker images hosted on Hub. Now, what if we wanted to create a Docker image with specific Linux network tools and use it as a network node inside an emulator? Think it can't be done? There are already at least two such repositories on Hub, both based on Debian Jessie and sharing much of the same setup: one used in GNS3 and the other in IMUNES:
1. https://hub.docker.com/r/gns3/dockervm/
2. https://hub.docker.com/r/imunes/vroot/
They're actually automated builds created from instructions in Dockerfiles (Docker image configuration files) hosted in GitHub repositories. This is actually common practice: users push their Docker images to Hub and save Dockerfiles to GitHub so others can fork the repos and build their own images. Give it a try!
Now, there's a lot more to building a network emulator than just downloading a Docker image that fits in nicely, so stay tuned for more posts in my "Building a network emulator" series.

## September 08, 2015

### Goran Cetusic(GNS3)

#### Generic nodes in network emulators

As invaluable tools in networked and distributed systems research, network emulators and simulators offer a viable alternative to live experimental networks. In short, it's easier to draw a network using software and test it than to use real hardware. But before we start, let's go through one crucial point that will help us understand what we're trying to achieve.

Simulators vs Emulators
All tools that provide a testbed for networking scenarios fall into two categories: emulators and simulators.
Here's what Wikipedia says about the difference between those two:
Emulation differs from simulation in that a network emulator appears to be a network; end-systems such as computers can be attached to the emulator and will behave as if they are attached to a network. A network emulator emulates the network which connects end-systems, not the end-systems themselves.Network simulators are typically programs which run on a single computer, take an abstract description of the network traffic (such as a flow arrival process) and yield performance statistics (such as buffer occupancy as a function of time).
Pure network simulators are mostly used in theoretical network research, like new protocol algorithms. For example, ns-3 is an open-source network simulator: users can develop their own network protocol and simulate its performance. It also has an animation tool to visually observe how the network operates and is used primarily in academic circles for research and education. GNS3, one of the most popular network tools, began as a Cisco emulation tool but has grown into a multi-vendor network emulator. Every node in GNS3 represents some network device, be it a switch in the form of a Cisco IOS image or a PC using some virtualization technology like VirtualBox or VMware. However, it's important to note that these nodes emulate real network devices. As Wikipedia said, users can attach to the emulator just like to a real network. So regardless of what GNS3 calls itself, it's actually an emulator, and a very useful tool for network administrators who design real networks with existing protocols, not for experiments with network protocols that help write research papers.
Some network tools that call themselves emulators are simulators, and most tools that call themselves simulators are actually emulators. Another example is IMUNES (Integrated Multiprotocol Network Emulator/Simulator), which can't decide whether it's an emulator or a simulator. It's an emulator. One good reason for calling an emulator a simulator is that users tend to google "network simulator" more often than "network emulator", so it's harder to be found if you call yourself an emulator.

The best network emulators generally have graphical interfaces that let users drag and drop network components into a sandbox to set up a network environment and then run experiments on it. Usually those components (network nodes in the sandbox) represent your off-the-shelf network equipment: switches, routers, PCs etc. Different emulators use different techniques to implement those components. GNS3 uses a combination of Cisco IOS images for switches and routers and various virtualization technologies (VMware, VirtualBox, Docker, VPCS) to implement PCs. IMUNES uses jails on FreeBSD and Docker on Linux to emulate all of its nodes. Basically, every node in IMUNES is either a FreeBSD or a Linux virtual machine, depending on which OS you're running. So a router in IMUNES is actually a VM with a mainstream OS and a special setup (e.g. some kind of routing software).

Both IMUNES and GNS3 virtual machines can be anything the OS they run on supports. If someone asks what a good general-purpose OS for networking is, the answer will in most cases be Linux. Indeed, Linux is a good choice for a generic network node in emulators. Let's simplify things and temporarily assume we want to build a network emulator that only uses virtualized Linux machines as nodes, because we can set up a variety of network services on those machines, like Apache, Nginx and dhcpd. We couldn't do that if we only used components like Cisco routers. Sure, we could test a proprietary network protocol, but we couldn't really stress test our new Gunicorn+Nginx setup. So generic nodes do have their uses in the world of network emulators. In the next part we'll talk about how to efficiently create such generic network nodes using some popular virtualization technologies.

## September 05, 2015

### Vipul Sharma(MoinMoin)

#### GSoC 2015: Code Submission

Passed the final evaluations :) Thanks to my mentors: Thomas and Saurabh for their support and guidance. I learned a lot about open source and moin-2.0 in these 3 months of coding period. I would like to congratulate all the 916 students who passed the final evaluations (as per the official blog).
I've also uploaded the code sample, my changesets which contains my work which I did during GSoC 2015 Coding Period.
Here is the link to the code sample
My bitbucket repository: moin-2.0

## September 01, 2015

### Andres Vargas Gonzalez(Kivy)

#### ggplot for python calling kivy matplotlib backend

Based on the grammar of graphics, ggplot is a library for plotting graphs in R. From some reading, it seems that ggplot is a very good choice for producing multi-layered graphs. I found a package for Python which provides a ggplot structure wrapped around matplotlib instructions. I gave it a try by installing it from source (https://github.com/yhat/ggplot); then, since the library depends heavily on matplotlib, I changed the default backend to Kivy by placing these two lines on top of each example tested:

import matplotlib
matplotlib.use('module://kivy.garden.matplotlib.backend_kivy')



The main advantage in my case is the minimal set of instructions required to create plots. Some resources can be found at http://blog.yhathq.com/posts/ggplot-for-python.html. I have not fully tested it, but I am giving it a try.

#### Matplotlib backend kivy in Windows hands on

The backend implementations for matplotlib in Kivy work only with Kivy versions 1.9.1 or later. You need to install Kivy in development mode following these steps: http://kivy.org/docs/installation/installation-windows.html#use-development-kivy. After step 10 you can additionally run

python setup.py install



The script kivy.bat creates an environment in which some Python tools are ready to use, for instance pip. The next step is to download numpy and matplotlib from these binary repositories: http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy and http://www.lfd.uci.edu/~gohlke/pythonlibs/#matplotlib. Once downloaded, install both using pip, then install kivy-garden and the matplotlib package for Kivy.

pip install "numpy-1.10.0b1+mkl-cp27-none-win32.whl"
pip install "matplotlib-1.4.3-cp27-none-win32.whl"
pip install kivy-garden
garden install matplotlib



You can now go to the folder where garden packages are installed by default. In my case C:\Users\my_user\.kivy\garden\

cd garden.matplotlib/examples
python test_plt.py



And you should be able to see the matplotlib widget working on windows. Follow the steps from http://kivy.org/docs/installation/installation-windows.html#send-to-method to create a shortcut to execute your kivy apps.

Now you can go directly to the garden path using a file explorer, right-click on test_plt.py and send it to kivy-2.7.bat, which will start the application.

## August 28, 2015

### Ambar Mehrotra(ERAS Project)

#### GSoC 2015: Concluding Blog Post

Hello everyone, GSoC 2015 has finally come to an end. This journey which began around 3 months ago finally finished on 22nd of August and I am glad that I could be a part of it.
I learned a lot of things during the past 3 months and also had a wonderful opportunity to be a part of the project which might be used by future astronauts. :)

All the work that I did can be found in the ERAS repository under the "habitat_monitor" section. My aim was to develop a top level monitoring interface for the V-ERAS project which could be used to monitor the entire habitat.
I have been able to achieve more or less what was required, with the exception of PANIC integration. This was because I couldn't get the PANIC API to work despite many attempts, and I couldn't get much help from their forum.
I will continue to work on this project in the future and improve it as and when required. I would also love to remain associated with the ERAS project and work with all the people who are involved with it.

I would like to thank my mentor for supporting me in harsh times and helping me throughout. I would also take the opportunity to thank all the other people associated with this programme who helped me out. I would also like to thank Google for providing such a wonderful opportunity and all my friends and family for their support.

Cheers :)

### AMiT Kumar(Sympy)

#### GSoC : Throughout in SymPy # Wrap Up

Hi! I am Amit Kumar (@aktech), a final year undergraduate student of Mathematics & Computing at Delhi Technological University. This post summarizes my experience working on GSoC Project on Improving Solvers in SymPy.

## Introduction

I first stumbled upon SymPy last year while looking for an open-source computer algebra system to contribute to. I didn't have any open-source experience by then, so SymPy was an ideal choice for getting into the beautiful world of open source. I wasn't even proficient in Python, so at first it was a little difficult for me, but thanks to the beauty of the language itself, anyone gets comfortable with it in no time. Soon I decided to participate in Google Summer of Code under SymPy, though at that point I hadn't decided which project I would like to work on over the summer.

##### First Contribution

I started learning the codebase and made my first contribution by fixing an EasyToFix bug in solvers.py through PR #8647. Thanks to @smichr for helping me make my first ever open source contribution. After my first PR, I started looking for more things to work on and improve, and I started committing quite often. During this period I learnt the basics of Git, which is one of the most important tools for contributing to open source.

## Project Ideas

When I got a bit comfortable with the basics of SymPy and with contributing to open source in general, I decided to choose an area (module) to concentrate on. The modules I was interested in were Solvers and Integrals; I was literally amazed by the capability of a CAS to integrate and solve equations. I decided to work on one of these over the summer. There was already some work done on the Integrals module in 2013 which was yet to be merged, and I wasn't well versed in Manuel Bronstein's work on methods of integration in a computer algebra system, so I was a little skeptical about working on Integrals. The Solvers module attracted me due to its awesome capabilities; I found it one of the most useful features of any computer algebra system, so I finally decided to work on the Solvers module.

## Coding

I was finally accepted to work on Solvers this summer. I had my exams during the community bonding period, so I started almost in the first week of the coding period. I made a detailed timeline of my work for the summer, but from experience I can say that's seldom useful, since you never know what may come between you and your schedule. For instance, PR #9540 was a stumbling block for a lot of my work, and fixing it was necessary before proceeding.

#### Phase I (Before Mid Terms)

When the coding period commenced, I started implementing linsolve, the linear system solver that is tolerant of different input forms and can solve almost all forms of linear systems. At the start I got a lot of reviews from Jason and Harsh regarding improvements to the function. One of the most important things I learnt, which they focused on, was Test Driven Development: they suggested I write extensive tests before implementing the logic, which helps in visualizing the final implementation of the function and avoids API changes.
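
As an illustration of the kind of input tolerance described above (a minimal example of my own, not taken from the PR), linsolve accepts a list of equations together with the unknowns and returns the solution set:

```python
from sympy import FiniteSet, linsolve, symbols

x, y = symbols('x y')

# Solve the linear system x + y = 2, x - y = 0;
# bare expressions are interpreted as equal to zero.
solution = linsolve([x + y - 2, x - y], (x, y))
```

The same system can also be passed as Eq instances or as an augmented matrix, which is what "tolerant to different input forms" refers to.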

After linsolve I implemented ComplexPlane, which is basically complex sets. It is useful for representing infinite solutions in the Argand plane. While implementing this I learnt that choosing the right API is one of the most important factors when designing an important piece of functionality. To know more about it, see my blog post here. During this period I also worked on fixing Intersection of a FiniteSet with symbolic elements, which was a stumbling block.
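
For a simplified taste of the FiniteSet intersection behaviour involved (a concrete example of mine, rather than the harder symbolic case the fix addressed):

```python
from sympy import FiniteSet, Intersection, Interval

# Only the elements of the finite set that lie in [0, 2] survive.
result = Intersection(Interval(0, 2), FiniteSet(1, 3))
```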

#### Phase II (After Mid Terms)

After successfully passing the mid terms, I started working more on the robustness of solveset. Thanks to @hargup for pointing out the motivation for this work: the idea is to tell the user the domain of the solution returned. The simplest motivation was the solution of the equation |x| - n; for more info see my blog post here. I also worked on various trivial and non-trivial bugs which were more or less blocking my work.
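
The idea can be sketched with a concrete value in place of the symbolic n (an illustrative example of mine):

```python
from sympy import Abs, FiniteSet, S, Symbol, solveset

x = Symbol('x')

# Over the reals, |x| - 2 = 0 has exactly the solutions -2 and 2.
real_solutions = solveset(Abs(x) - 2, x, domain=S.Reals)
```

Passing domain=S.Reals tells solveset which solutions the caller cares about; this is the domain argument mentioned later in this post.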

Then I started replacing solve with solveset in the codebase; the idea was to make a smooth transition between solve and solveset. While doing this, Jason pointed out that I should not remove the solve tests, since that could leave solve vulnerable to breakage, so I reverted the removal of those tests. Later we decided to add a domain argument to solveset, which helps the user easily dictate to solveset which solutions they are interested in; thanks to @shivamvats for doing this in a PR. After the decision to add the domain argument, Harsh figured out that solveset was still vulnerable to API changes, so it was not the right time to replace solve with solveset. We decided to halt this work, and as a result I closed several of my PRs unmerged.

I also worked on implementing differential calculus methods such as is_increasing, which are also merged now. Meanwhile I have been working on documenting solveset, because a lot of people don't know what we are doing and why we are doing it, so it's very important to answer all the subtle questions which may come up in their minds. We decided to create a FAQ-style documentation of solveset; see PR #9500. This is almost done, some polishing is needed, and it should be merged soon.
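
For instance (a toy example of my own, using the import path as it exists in current SymPy), the calculus helpers can decide monotonicity on an interval:

```python
from sympy import Interval, Symbol, oo
from sympy.calculus.singularities import is_increasing

x = Symbol('x')

# -x**2 has derivative -2*x, which is non-negative on (-oo, 0],
# so the function is increasing there.
increasing = is_increasing(-x**2, Interval(-oo, 0))
```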

Apart from my own work during this period, some other work is worth mentioning: ConditionSet by Harsh, which serves the purpose of an unevaluated solve object and much more for our future endeavours with solveset, and codomain and not_empty by Gaurav (@gxyd), which are also important additions to SymPy.

TODO: this probably needs a comprehensive post, which I will write soon.

## Future Plans

Recently Harsh came up with the idea of a tree-based solver. Now that ConditionSet has been introduced, solving equations can be seen as a set transformation. We can do the following things to solve equations (abstract view):

• Apply various set transformations on the given set.
• Define a metric of usability, or a notion of one solution being better than another.
• Different transformations would be the nodes of the tree.

As a part of this I worked on implementing a general decomposition function, decompogen, in PR #9831. It's almost done and will be merged soon.
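
To show what the function does (this is the standard example from its documentation): decompogen splits a composite expression into a chain of simpler functions whose composition reproduces it.

```python
from sympy import Symbol, cos, sin
from sympy.solvers.decompogen import decompogen

x = Symbol('x')

# sin(cos(x)) is sin applied after cos, so the decomposition
# is the list [sin(x), cos(x)].
parts = decompogen(sin(cos(x)), x)
```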

I plan a long-term association with SymPy, and I take full responsibility for my code. I will try to contribute as much as I can, particularly to the sets and solvers modules.

## Conclusion

On a concluding note, I must say that getting the opportunity to work on SymPy this summer has been one of the best things that could have happened to me. Thanks to Harsh for helping me in all my endeavours and for being one of the best mentors I could have got. I would like to thank Sean as well, who took time from his busy schedule to attend meetings and hangouts and to do code reviews. Also thanks to Chris Smith, who is the most gentle and helpful person I have ever seen; he is one of the reasons I started contributing to SymPy. Thanks to Aaron, Ondrej, and last but not least my fellow GSoCers at SymPy: leosartaj, debugger22, sumith1896, shivamvats, abinashmeher999. Special thanks to the whole SymPy team and community for a wonderful collaboration experience. Kudos!

## August 27, 2015

### Mark Wronkiewicz(MNE-Python)

#### Wrapping Up / Winding Down

C-day + ~10 weeks

By the end of the project, I finished or almost finished all of the major improvements for the signal space separation (SSS) algorithm. The initial SSS implementation required about six weeks to understand and then implement to a degree that matched the proprietary version of the software. Beyond the fundamental algorithm, there are three major “expansion packs” that make the algorithm more attractive for real-world noise rejection. The first is the ability to reconstruct bad channels; since we are oversampling the spherical harmonic bases, we can actually reconstruct the signals at bad sensors where we were forced to throw out that sensor’s data. There are two other extensions, fine calibration and temporospatial SSS (tSSS), that are both underway but not yet complete. As described before, channel reconstruction, fine calibration, and tSSS are all methods to remove more noise from the MEG signals. Beyond these published algorithms, our fine calibration procedure even includes improvements over the proprietary SSS implementation. Of course, I’ll keep working on these extensions until they are finished so that our library has a powerful and complete implementation of the SSS functionality.

Now that the GSoC project is over, I’m glad I followed a post-doc’s advice to use the program as a chance to explore concepts outside my thesis area that manifest in code that’s helpful to our source imaging community. I gained a number of technical skills that I didn’t have before and probably wouldn’t have gained without this opportunity. I’m much more familiar with the concepts behind Git and GitHub, and I’m a better collaborator for having worked with a number of MNE-Python’s other coders. To understand and implement the SSS algorithm, I also had to (heavily) brush up on my physics, in particular Maxwell’s equations and spherical harmonics. Having a practical application of these concepts served as a great carrot to stay motivated, as much of the necessary mathematics was in impressively dense papers from before the 1980s. At the very least, I'll have one fancy physics concept to draw on a cocktail napkin many years from now.

School's out for the (remaining) summer.

## My proposal

I proposed to work on an API allowing the specification of bodies and joints. It was intended to be a layer above the SymPy mechanics module, with the numeric and symbolic parts working together. I also proposed to provide functions for defining the system using a points-and-velocities approach, along with the default one.

#### GSoC! A Journey Summarised.

GSoC ended on the 21st and the final evaluation ends tomorrow. This may look like an end, but I can assure you it's not. GSoC celebrates the spirit of open source and gives an opportunity to be a part of it. Honestly, I started contributing after I got to know about GSoC; my seniors at college were already doing it, and that was enough to get me started.

### Richard Plangger(PyPy)

#### The End of Summer, PyPy ♥ SIMD

The Summer of Code is approaching its end and it has been an amazing experience for me. Not only that, but I am also approaching the end of my master's thesis, which for me was one of the main goals of the last five years. The sad part is that I will most likely not be able to participate as a student in the future.

It has been a really great time for me, and for anyone feeling unsure of applying or not, I can warmly recommend to try.

PyPy's vectorizing optimizer

So what is the outcome of my GSoC? Let's quickly have a look on what I proposed:

The goal of this project is to enhance the trace optimizer of PyPy. By definition NumPy arrays are unboxed, homogeneous (one data type) and contiguous in memory (despite certain exceptions). The same is true for the array module in the standard library. The new optimizer will use this opportunity and exploit a SIMD instruction set (such as SSE or AVX) present on modern CISC processors (e.g. x86). This should lead to enhanced execution speed for arithmetic-intensive applications that run on top of PyPy.

I have already shown that individual traces get faster when the optimization is turned on. But that does not necessarily mean programs get faster: the optimization only helps if your program spends a significant fraction of its time in a vectorizable trace loop.

The following shows the basic NumPy operations stressed. In a loop, each NumPy operation is executed 1000 times. This sample program shows the basic setup for multiply-float64:

import numpy

def bench(vector_a, vector_b):
    for i in range(1000):
        numpy.multiply(vector_a, vector_b, out=vector_a)

The speedup numbers (bigger is better) show the theoretical maximum speedup (bounded because of the way SIMD works) and what the optimization actually achieves. The baseline is the portable PyPy version 2.6.0 (none of the code I changed is included in that version). The vectorized version is a026d96015e4.

Considering that currently aligned memory load cannot be used I think this is a pretty good result. For float64 multiply the maximum speedup is nearly reached. float32 performs very poorly because the JIT currently does not operate on float32, but always casts to float64, executes the arithmetic and casts back to float32.

Let's run some real programs. SOM (Self-Organizing Maps) is an algorithm that maps N dimensions onto a 2D grid. It contains a lot of vector distances and vector operations. Dot is the matrix dot product and wdist is the weighted euclidean distance. The number in the benchmark name indicates the size of the vector register.
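
As a rough sketch of what such a kernel looks like (my reconstruction; the benchmark's actual code may differ), the weighted euclidean distance is just a chain of elementwise array operations, which is exactly the shape of code the vectorizer targets:

```python
import numpy as np

def wdist(a, b, w):
    # Weighted euclidean distance: sqrt(sum(w_i * (a_i - b_i)**2)).
    # Elementwise subtraction and multiplication over contiguous
    # float64 arrays is the pattern the SIMD optimizer can vectorize.
    diff = a - b
    return np.sqrt(np.sum(w * diff * diff))

d = wdist(np.array([0.0, 3.0]), np.array([4.0, 0.0]), np.ones(2))
```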

As you can see, bigger vectors are better for the optimization. The cool thing is that PyPy is now able to connect trace trees that use SIMD operations, which is not possible if the numerical kernels are written in e.g. C. Again I think the results are pretty cool, and the speedup should get even crazier once AVX is implemented in the JIT backend.

Non NumPy traces

At the beginning it was a bit unclear whether this was possible, but let me show you a very basic example of a Python program where the optimization creates a very good piece of assembler.

# arrayloop.py
import array
w = array.array('d', [0]*10000)
l = array.array('d', [1]*10000)
def f():
    i = 0
    while i < 10000:
        l[i] = l[i] + w[i]
        i += 1

for i in range(100000):
    f()

$ time ./pypy-c --jit vec_all=1 ~/arrayloop.py
0.67user 0.01system 0:00.69elapsed 99%CPU
$ time ./pypy-c --jit vec_all=0 ~/arrayloop.py
1.65user 0.01system 0:01.67elapsed 99%CPU

Boom. 1.65 / 0.67 = ~2.46 times faster. The speedup beyond the pure SIMD factor is not due to caching effects, but to reduced guard checking (all array bound checks are checked only once).

If you happen to be into PyPy, you know that the list strategy (e.g. for [42.0] * 44) will not store Python objects, but will store double floating-point values directly in memory. Only when the first non-floating-point value is stored into the list is it transformed into a list of Python objects. Perfect! This makes it possible, in principle, to let the optimization run on loops that manipulate lists.

Unfortunately this is currently not possible, because there are some fragments in the trace that the optimizer cannot transform. We are working on it and the next major PyPy update might contain this as well!

Stats

The optimization spans at least seven files and adds about 3000 lines of code; the test suite covers roughly another 4000 lines in five files. This is the newly added code in grand total, excluding every line I changed in the existing source base.

Does this only work for PyPy?

Had I put the same effort into a different virtual machine, it would have been tied to that language and that virtual machine only.

That would be lame right?

But (you might be aware of this) any language interpreter written in RPython can reuse the tracing JIT compiler. That means that "magically" this optimization is added to the final executable by default.

At this point it should be said that programs will not automatically get faster; there are some rules your traces must obey. But this is easy, and much easier than writing assembler or using compiler intrinsics.

This is great because if you want to write an array-processing language in RPython, you gain all the features it already provides and get to use SSE4.1 (for now) to speed up your programs.

Final words

I plan to keep working on PyPy and I think this project will soon be merged into default. We talked about including it in the upcoming 2.6.1 release, but since there are other great changes coming to enhance the trace compiler, it was postponed.

Thank you PyPy Team, thank you Google & PSF! I had a great time tinkering on my proposal.

### Aron Barreira Bordin(Kivy)

#### Kivy Designer - Python Mobile made easy :)

Hi!

This is the final report of Kivy Designer development, and the end of GSoC :(

In this post I'll show you an overview of my original proposal, and compare with the final result.

## Expectations

Right now, most of the toolchain for Python on mobile is still under development, and to new developers it usually sounds confusing.

My goal with this project was to evolve Kivy Designer into an IDE that organizes and helps us develop Python/Kivy applications targeting multiple platforms.

## Project Overview

I've made some small modifications to the original proposal. In my proposal, Kivy Designer was to be integrated with Hanga, but due to some problems with Hanga, this part of the project was not developed.

I've made some important improvements to Kivy Designer. I think the main feature is the Builder. The Builder helps you target the same project/source code to multiple platforms, so you can easily develop your app on your computer and then deploy it on your mobile device :) The Builder currently supports Buildozer and the default Python interpreter to run on your computer, and it's ready to be integrated with new tools.

Some enhancements were made to help with development itself. An important one is the Jedi integration: Kivy Designer now provides auto-completion for Python source code :) And custom themes for CodeInput :)

It's integrated with Kivy Modules, which help you see the app running in different screen sizes, dimensions and orientations, debug it, and more.

And, sure, what is a project without good version control? Kivy Designer is now integrated with Git. You can easily use Git features inside Designer and work with remote repos, branches, etc.

## Progress Overview

I was able to complete my proposal and even add some extra features to the project, but this summer was not enough to release a complete and powerful IDE. We have a lot of new features; however, it is still a WIP. Unfortunately I had a different calendar at university this year, so I have been studying during the whole of GSoC (today, August 27, is my last day of classes; I did my last test some hours ago, and I'm starting my university vacation now :P ).

So I was not able to focus a lot of time on GSoC :(

I just feel that, with more time, I could have delivered an even more powerful project. But ...

## Is it the end ?

GSoC was an amazing and unique experience for me.

At the beginning of the year I had been looking around, trying to find open source projects that I could help with. And then I read about GSoC.

I see GSoC as a bridge that helped me connect with and learn about a project and its community. So now I'm fully able to keep contributing to them, make my project even better, and have a good connection with the open source community.

Now I can tell you that I'm really experienced with Kivy Designer, and I have a good understanding of Kivy itself, so let's keep playing :)

## August 26, 2015

### Andres Vargas Gonzalez(Kivy)

#### Matplotlib for Kivy Final Notes

As an outcome of one of the projects from Google Summer of Code 2015, two packages were created and are available as Kivy garden packages.

These two packages can be used separately or can be combined as shown in https://andnovar.wordpress.com/2015/08/06/corner-detection-on-strokes-and-strokes-annotations-on-matplotlib/.

Warning:

– If you use the dash_list attribute when creating a line in a figure, it will not work until the corresponding update lands in the next Kivy version.
– The backend only works with Kivy >= 1.9.1, since some changes were made in the Widget class.
– The matplotlib backend has not been tested on Android.

Both packages have been tested and debugged; however, you are highly encouraged to submit any issue you find while using them. Some of their advantages and weaknesses are covered in the following paragraphs:

The first package is garden.matplotlib, a matplotlib backend for Kivy. It allows you to create applications using pyplot instructions or to embed figures into a Kivy application. There are two possible backends that can be invoked:

– backend_kivy, which renders using Kivy graphics instructions.
– backend_kivyagg, which renders using a static image texture.

Both backends can be connected to the default matplotlib events: https://andnovar.wordpress.com/2015/06/15/connecting-events-between-kivy-and-matplotlib/. Additionally, another widget called NavigationToolbar can be instantiated and used with both backends; a better description can be found at https://andnovar.wordpress.com/2015/08/06/navigation-toolbar-with-matplotlib-events-connected/. A FigureCanvas can be used without a NavigationToolbar, but a NavigationToolbar needs a FigureCanvas to be instantiated. The main advantage of this package is that a figure built with matplotlib instructions can be added as just another Kivy widget inside a Kivy application.

There are some features that were not implemented, and users should be aware of them:

– backend_kivy and backend_kivyagg are non interactive backends (http://matplotlib.org/faq/usage_faq.html#what-is-interactive-mode)
– configure_subplots button in the NavigationToolbar is not implemented.

Known bugs:

– When zooming on an area with not enough graphics instructions the render is not properly done. https://github.com/andnovar/kivy/issues/72

Conversely, there are some features implemented in backend_kivy that are not implemented in other backends based on graphics instructions:

– clip path when rendering an image.
– draw_path_collection method for optimization.
– draw_markers for optimization.

We believe most of matplotlib's current capabilities have been considered in this backend implementation. However, we encourage you to report any missing capability you may find.

### Manuel Jacob(PyPy)

#### Summary

At the beginning of the GSoC coding phase I gave a rough schedule. Shortly after it turned out that the schedule was unrealistic. I'll describe why in more detail.

## PyPy3 2.6.0 Release

The release was almost done, but then a big release blocker appeared: Windows support. The default branch already has many failing tests on Windows, but in the py3k branch it's even worse, to the point where it didn't make sense to do a Windows release. I started to set up a Windows development environment to bring the Windows support to a usable state, but got demotivated very quickly, so I continued working on Python 3.3 support instead of finishing the release. A partial release is of course better than no release at all. The PyPy version implementing Python 2.7 will see the 2.6.1 release soon. Shortly after that release I'll do a PyPy3 release, but without Windows support unless someone helps bring it to a usable state.

## Python 3.3 support

The Python 3.3 support is almost complete feature-wise. There are still many failing tests that exercise obscure corner cases, and these are sometimes very hard to fix: basically, they are everything left over after picking and fixing the easy ones. :) I'll continue to work on it now that the GSoC period is over.

## Other work done

Python 3.x support is developed in two branches: "py3k" for general Python 3.x support, currently targeting Python 3.2, and "py3.3" for Python 3.3 support. Apart from that, there is continuous development in the "default" branch, targeting Python 2.7, mainly consisting of performance improvements. In order to profit from that, we must regularly merge the default branch into the py3k and py3.3 branches. Fortunately, most improvements to the JIT compiler don't create merge conflicts, thanks to PyPy's automatic JIT compiler generation, but there are still many merge conflicts, which are very time-consuming.

Tests are very important for PyPy. At the beginning of the coding phase I spent much time fixing all the test failures occurring in the py3k branch. After that I tried to fix all the PyPy3-specific bugs reported in the bug tracker. I fixed most of them, but new ones were added in the meantime.

## Final words

This year's GSoC didn't run very smoothly, but I think it helped a lot to bring PyPy's Python 3.x support forward. Before the coding phase, progress in the two Python 3 branches had stagnated a bit because almost all the low-hanging fruit was already taken, leaving mainly brain twisters. Many of these are now solved. Almost all tests in the py3k branch are passing (except on Windows), and many fewer tests are failing in the py3.3 branch. I think this builds a good foundation for further work on Python 3.x support.

### Goran Cetusic(GNS3)

#### GNS3 Docker support pending merge

So the experimental GNS3 support for Docker is done. I'm sure it has a few bugs, but that's why it's experimental. We're discussing what to do with the code: either merge it from the respective docker branches into unstable, or ship it with the latest version of GNS3. Because I like to keep users in the loop, here's the transcript of our discussion:
We have two options:
* wait until the unstable branch becomes the master branch at the 1.4 release, and make the docker branch the new unstable branch
* merge docker support into unstable and add an experimental flag in settings to show this option.

I synced the unstable branch with the docker branch sometime last week, so if nothing major has happened in the last couple of days it should be mergeable.
We should do two things.
Basically, importing the docker branch into the master branch shouldn't cause any problems with the rest of the code IF GNS3 successfully runs on those machines. Please note that I haven't tested it on Windows or Mac, since Docker support on those operating systems has a different setup and I'm not sure how docker-py will behave there. docker-py attaches to a socket on Linux, so my priority was Linux. As long as GNS3 starts and you can work with other VMs you should be fine. If there's an error, it should be easily fixable with an if statement or similar, so let me know if this happens; I'd be glad to fix issues and improve Docker support after GSoC. Once we're sure the rest of the code is OK, we're good to merge even if Docker support isn't extensively tested.

That being said, I'm not sure what your policy is on users getting errors, but I'm all for pushing new code as soon as possible, as long as it's flagged experimental so users are aware that it might crash, and it can't crash the rest of the product. This way users will start using and testing it for us and report bugs without getting frustrated. I'd say DO IT! :D
Feel free to try it before it gets polished and merged in to GNS3 master branch. Install the latest versions of GNS3 server and GUI from the docker branches:

• https://github.com/GNS3/gns3-server
• https://github.com/GNS3/gns3-gui
Make sure the user running GNS3 has permission to create and manipulate Docker containers. This is usually accomplished by adding that user to the docker group; popular Linux distributions *should* do the rest, but check the Docker documentation on how to run Docker without root permissions.
Also, Docker support requires ubridge to manipulate network namespaces and do UDP tunnels. Use the latest version from master branch:
• https://github.com/GNS3/ubridge
It's a cool little project in itself and has some interesting features.
I'll write something more detailed on how to use this new Docker support but until then - try it yourselves!

## August 25, 2015

### Christof Angermueller(Theano)

#### GSoC: Endgame

It’s already the end of GSoC2015! I used the remaining two weeks to wrap up my project. Specifically,

• I made some further layout improvements and fixed remaining bugs,
• I tested the visualization in different browsers and wrote test cases,
• I made sure that all interfaces are documented, and
• I wrote a user guide.

Finally, I submitted a pull request, which will hopefully be merged into Theano’s master branch soon! Then it will be your job to test the new d3viz module for interactive visualization of Theano graphs, and to let me know about issues and feature requests!


## Overview

I am very happy with the outcome of my work during GSoC. On the one hand, it is true that the complete list of goals in the original application has not been accomplished. On the other hand, the focus of my contributions changed once I started working: instead of applying background models to source observations, I implemented a lot of observation-handling code that, contrary to plan, was not already available. Indeed, the majority of the items in the API proposal have been worked out and added as functionality to Gammapy. In addition, I participated in the necessary code cleanup for the release of Gammapy version 0.3.

List of main pull-requests during my GSoC (all merged in the trunk):
• Document observation tables and improve gammapy.obs [#278]
• Observation table subset selection [#295]
• Add cube background model class [#299]
• Make background cube models [#319]
Other relevant pull-requests (also merged):
• Add function to fill acceptance image from curve [#248]
• Consistent random number handling and improve sample_sphere [#283]
(There are more, smaller pull requests dealing with cleanups, fixes or small additions that are not listed here.)

All in all, a lot of new functionality has been successfully added to Gammapy, as demonstrated by the example in the final project report part 1 post and by the examples in the documentation links listed below, making this a fruitful project.

As a summary of the work produced during GSoC, I am repeating the list of links to the documentation that I already posted in the progress section of the final project report part 1.
This documentation explains the most important contributions of my GSoC project:
I would like to take the opportunity to thank the mentors of the project for their helpful input. At all times, at least one of them was there to answer my questions and give positive feedback. I learnt a lot during this summer!

For instance I deepened my knowledge of the scientific python stack: numpy, scipy, matplotlib. I learnt of course the basic tools needed for my project: git, GitHub, Astropy, Gammapy. I also learnt how to work collaboratively in a pull-request system with code reviews.

This recently acquired knowledge, together with my previous experience programming in object-oriented languages, granted me access to the list of Gammapy core developers and maintainers with write access to the repository.

To conclude, the completion of this work has taken many hours of hard work (~500 h), much satisfaction, some frustration, and 0 drops of coffee! :-)

### Aman Jhunjhunwala(Astropy)

#### GSoC 15 Final Post : Until we meet again !

GSoC finally ends !

I hope everyone agrees that it has been an exciting and exhausting summer... probably the best 3 months we have experienced to date! There were ups and downs for all of us, but overall it has been a heck of an amazing journey. Given another chance, I would definitely wish to experience it again!

The project was successfully completed! All changes, feedback, etc. have been integrated successfully. The website is up and running at http://www.astropython.org. However, my association with the project never ends: I will always be the maintainer of the project and always be ready to help out if anything ever happens to it. The code has been moved to the Astropy organization, and the final evaluation of GSoC is complete!

A huge thank you to the Python Software Foundation and Astropy for taking me with them on this wonderful journey! Everyone at my organization has been very helpful and supportive. I cannot thank my mentors, Tom Aldcroft and Jean Connelly, enough; they are some of the best people I have ever had the luck of meeting!

Signing off ,till we meet next time,

Because the Open Source Journey never ends….

Aman

### Julio Ernesto Villalon Reina(Dipy)

#### Final Report GSoC 2015

Tissue classification to improve tractography

The main aim of this project is to implement an image segmentation algorithm that is able to classify the different tissue types of the brain using structural and diffusion MRI images. The ultimate goal is to add this algorithm to the set of tools available in DIPY (Diffusion Imaging in Python: http://dipy.org) and incorporate the resulting tissue probability maps in the processing pipelines for tractography [1] (Anatomically-constrained tractography). We decided to use a Bayesian approach for the segmentation, as has been previously done by Zhang et al. and Avants et al. [2, 3].

The Bayesian approach to image segmentation

Bayesian image segmentation algorithms use a likelihood model, also known as the observation model, and a prior probability model. The product of these two is proportional to the posterior probability P(x|y), where x is the label and y is the input image. The optimal label for a voxel, x̂, is found by maximizing this posterior probability (Maximum A Posteriori, or MAP). This is defined by:

P(x | y) = P(y | x) P(x) / P(y)    (1)

x̂ = argmax_x P(x | y) = argmax_x P(y | x) P(x)    (2)

where P(y|x) is the likelihood, P(x) is the prior, and 1/P(y) is a constant. The observation model chosen for our algorithm is the Gaussian log-likelihood:

log P(y | x; θ) = −(y − μ_x)² / (2σ_x²) − log(√(2π) σ_x)    (3)

Here θ denotes the parameters, i.e., the mean μ and the standard deviation σ. This type of observation model has been used for brain segmentation by other researchers in the past [2]. It is important to note that, despite its simplicity, the Gaussianity assumption constituted one of the main challenges we encountered, as will become clear later in this report. We also used this likelihood to calculate the very first segmentation of the input image. This initial segmentation is suboptimal, but it is already a close approximation to the optimal segmentation. Note that no contextual information is used for the initial segmentation. In Figure 1 we show a T1-weighted coronal slice of a healthy young adult and the corresponding maximum likelihood segmentation.
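The initial maximum likelihood labeling can be sketched in a few lines of NumPy (a minimal illustration, not the actual DIPY implementation; the per-class means and standard deviations are hypothetical initial guesses):

```python
import numpy as np

def max_likelihood_labels(image, mus, sigmas):
    # Stack the per-class Gaussian log-likelihoods (equation 3) for every
    # voxel and assign each voxel the class with the highest value.
    ll = np.stack([
        -((image - mu) ** 2) / (2 * sigma ** 2)
        - np.log(np.sqrt(2 * np.pi) * sigma)
        for mu, sigma in zip(mus, sigmas)
    ])
    return np.argmax(ll, axis=0)
```

Because no neighborhood information enters this step, the result is noisy; the MRF prior described next is what regularizes it.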

Figure 1.

Markov Random Fields

We modeled the prior probability (the second factor in equation 1) with Markov Random Fields (MRFs), which do take contextual information into account. The idea behind MRF models is based on the simple intuition (nowadays formalized in a mathematical theorem) that a specific label for a voxel is favored by label homogeneity in its neighborhood. This assumption is formally called the "locality" assumption and, together with the "positivity" assumption, determines whether a random field is Markovian. The positivity assumption states that P(x) > 0.

By taking this into account an MRF distribution can be modeled as a Gibbs distribution:

P(x) = (1/Z) exp(−U(x))    (4)

A Gibbs distribution is characterized by a normalization constant Z and an "energy" term U(x). This energy function is defined over the neighborhood and is assumed to be the sum of all pairwise clique potentials (a voxel with each of its neighboring voxels), based on the Ising/Potts model. In order to determine the optimal label, the Gibbs energy needs to be minimized. This is where the Expectation Maximization (EM) algorithm comes into play. In each iteration we have an E-step, in which we update the label of each voxel by minimizing the energy function of the Gibbs distribution, and an M-step, in which we update the parameters (mean and variance) used to recalculate the log-likelihood, to which the Gibbs energy is added again; the E-step and M-step then repeat for another iteration. The minimization of the Gibbs energy function is accomplished with the ICM (Iterated Conditional Modes) algorithm. Here is where one of the most important parameters of our algorithm comes into play: the beta parameter, also called the "smoothing" factor, which basically determines how much weight is given to the neighborhood by the MRF prior model.
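A single ICM sweep can be sketched as follows (a toy 1D version under simplifying assumptions, not the project's actual code: real images are 3D, and the M-step that re-estimates the means and variances is omitted). Each voxel takes the label minimizing the negative Gaussian log-likelihood plus a Potts penalty of beta per disagreeing neighbor:

```python
import numpy as np

def icm_sweep(labels, image, mus, sigmas, beta):
    """One ICM sweep on a 1D signal: greedily update each voxel's label
    to minimize likelihood energy + beta * (number of disagreeing neighbors)."""
    new_labels = labels.copy()
    n_classes = len(mus)
    for i in range(len(image)):
        energies = []
        for k in range(n_classes):
            # Negative log-likelihood term (equation 3 with the sign flipped)
            e = ((image[i] - mus[k]) ** 2) / (2 * sigmas[k] ** 2) \
                + np.log(np.sqrt(2 * np.pi) * sigmas[k])
            # Ising/Potts clique potential: penalize neighbor disagreement
            for j in (i - 1, i + 1):
                if 0 <= j < len(image) and new_labels[j] != k:
                    e += beta
            energies.append(e)
        new_labels[i] = int(np.argmin(energies))
    return new_labels
```

With beta = 0 this reduces to the maximum likelihood labeling; larger beta values smooth isolated mislabeled voxels toward their neighborhood's label.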

Implementation details

Our algorithm needs 4 input variables: a) the gray-scale MRI image; b) the number of tissue classes to be segmented; c) the number of iterations to go through the E and M steps; and d) the beta value. As shown in Figure 2, the choice of beta makes a difference: the higher the beta, the "smoother" the segmentations are, which does not necessarily mean a better segmentation.

Choosing the right number of iterations is also critical. With just one or two iterations the result is not optimal, so usually more than ten iterations are needed. In Figure 3 we show the resulting segmentations after several iterations with a beta value of 0. It can be seen how the segmentation changes from iteration to iteration. Also, by examining the total final energies it is possible to see when the algorithm converges. In our tests we made sure that the total energy decreased after each iteration, as can be seen in Figure 4. We also found that convergence is reached when the relative difference between the total energies of two consecutive iterations is on the order of 10^-3. We noticed that the energy initially drops drastically in the first two iterations and then stabilizes around 4-6 iterations, at which point the total energy reaches a minimum and then starts oscillating with a small variation.
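The stopping rule described above can be expressed as a small helper (a sketch of the criterion as stated in this report, not the project's actual code; the function name is hypothetical):

```python
def has_converged(energies, tol=1e-3):
    """Stop when the relative change between the last two total
    energies falls below `tol` (order of 10^-3 in the report)."""
    if len(energies) < 2:
        return False
    prev, curr = energies[-2], energies[-1]
    return abs(prev - curr) / abs(prev) < tol
```

In practice one would also cap the number of iterations, since the energy eventually oscillates with a small amplitude rather than decreasing monotonically.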

Figure 2.

Figure 3.

Figure 4.

One of the outputs of the algorithm is the probability map for each tissue type, so there will be as many probability maps as the number of classes specified as input. For tractography applications, the brain is most often segmented into three tissue classes, namely cerebrospinal fluid (CSF), gray matter (GM) and white matter (WM). The white matter probability map in particular is used for so-called "Anatomically Constrained Tractography" [1], in which this map serves as a stopping criterion for the tracking algorithm. An example of these probability maps for the same slice shown in Figure 1 is shown in Figure 5.

The Gaussian likelihood model, although simple, caused some problems when performing the segmentation task. The input to our algorithm is assumed to be a skull-stripped (masked) brain image. When this step is performed and the skull is removed, the background becomes zero. Since we wanted to include the background as one additional "tissue class" and not require a brain mask as an input, the background converges rapidly to a mean of zero and a variance of zero from the very first iterations of our algorithm. Hence, this class no longer has a Gaussian distribution. Understanding this behavior was one of the major roadblocks during the development of the algorithm.

Figure 5.

Tractography

We tested our algorithm on one subject and used the output to compute tractography. The type of tractography that we performed is called "Anatomically Constrained Tractography", also known as ACT [1]. Tissue-specific probability maps are required for this type of tractography. We performed ACT using tissue probability maps derived from two different types of images: a T1-weighted image and a Diffusion Power Map (DPM) [4].

As a first step, we ran the tissue classifier on the whole 3D volumes (T1 and DPM). We used a beta value of 0.01 and 20 iterations in both cases. The images were skull-stripped in advance, and in the case of the T1-weighted image we used the N4 algorithm for bias field correction. The segmentation results for the T1 can be seen in Figure 6 and the probability maps of each tissue class (CSF, GM and WM) in Figure 7. The segmentation for the DPM can be seen in Figure 8 and the three tissue probability maps in Figure 9. Figure 10 shows the resulting tractographies when seeding only in the corpus callosum, using both types of tissue probability maps.

Figure 6. Tissue classification of subject's T1-weighted image

Figure 7. Tissue probability maps of subject's T1-weighted image

Figure 8. Tissue classification of subject's Diffusion Power Map

Figure 9. Tissue probability maps of subject's Diffusion Power Map

Figure 10 shows the resulting ACT tractographies of the corpus callosum.

Conclusions

We developed and fully tested a brain segmentation algorithm based on a Bayesian framework using Markov Random Field theory and Expectation Maximization. The algorithm proved to work on T1-weighted images as well as on dMRI-derived maps such as the DPMs [4]. It could successfully segment white from gray matter, even in troublesome areas with high concentrations of crossing fibers. We were able to use the derived probability maps from both the T1 and the DPMs for ACT. Further work will include an extensive validation of the algorithm, a comparison against traditionally used toolboxes (FSL and ANTs), and a search for quantitative methods to compare the tractographies derived from T1 tissue maps vs. those derived from DPMs.

REFERENCES

[1] Smith, R. E., Tournier, J.D., Calamante, F., & Connelly, A. Anatomically-constrained tractography: Improved diffusion MRI streamlines tractography through effective use of anatomical information. NeuroImage, 63(3), 1924-1938, 2012.

[2] Zhang, Y., Brady, M. and Smith, S. Segmentation of Brain MR Images Through a Hidden Markov Random Field Model and the Expectation-Maximization Algorithm. IEEE Transactions on Medical Imaging, 20(1): 45-56, 2001.

[3] Avants, B. B., Tustison, N. J., Wu, J., Cook, P. A. and Gee, J. C. An open source multivariate framework for n-tissue segmentation with evaluation on public data. Neuroinformatics, 9(4): 381–400, 2011.

[4] Flavio Dell'Acqua, Luis Lacerda, Marco Catani, and Andrew Simmons, Anisotropic Power Maps: A diffusion contrast to reveal low anisotropy tissues from HARDI data. The International Society for Magnetic Resonance in Medicine. Annual Meeting 2014, 10-16 May 2014, Milan, Italy

The most rewarding part of the program was the mentorship and the progress I made while being taught and guided by such a great team of people. The meetings were long and went deep into the details of the code and the testing. No single line was left out of sight, as every single line was tested. It was the first time I tested my code this thoroughly, and I am sure this will help me a lot in all my future endeavors. One interesting event during the summer was that I had the chance to meet my mentors at a conference (OHBM 2015). GSoC also helped me with 500 to go to the conference; without this I think things would have been a bit harder. Talking directly with my mentors and getting to know the general idea behind this open source initiative for brain imaging was of invaluable help and very enlightening. One piece of advice I would give future GSoC participants is not to give up if things aren't working as expected or the way you proposed. If there is a defined end goal, there are for sure many different routes to get there.

### Manuel Paz Arribas(Astropy)

#### Final project report part 1: progress

The last 2 weeks of Google Summer of Code (GSoC) have been a bit intense and exciting.

## Gammapy 0.3

First of all, Gammapy version 0.3 has been released. It is still an alpha version of Gammapy, but it already contains some of the functionality I developed for the GSoC. For instance:

• dummy observation table generator.
• container class for cube data.
• dummy background cube model generator.
• observation selection tools.
• many small fixes.

## Progress

As for my work in the last 2 weeks, I spent much of the time refactoring the code of the background cube model production script mentioned in my previous report and integrating the functionality into Gammapy. In addition, I added some tests to assure that the recently added code works, and documented the new classes, methods and scripts.
New functionality added to Gammapy:

• Observation grouping classes: classes to define groups of observations with similar properties (like observation altitude/azimuth angle or number of telescopes participating in the observations) in order to analyze them in groups instead of individually.
• Format converters: methods to convert the observation list formats of different experiments into the Gammapy observation list format.
• Dummy dataset generator: methods to generate a dummy dataset with event lists (data) and effective area tables (instrument response functions) in order to test some of the Gammapy functionality. The tools work as follows:
  • generate a dummy observation table with the tool mentioned above.
  • simulate background data (using a very simple model) according to the produced observation table.
  • store the data emulating the data structure of an existing experiment, like H.E.S.S.
• Class to handle the creation of background cube models: the methods acting on background cube models have been divided into 2 classes:
  • Cube: the basic container class for cube data mentioned above. This class is kept as generic as possible, allowing other classes to use it to contain other kinds of cube data in the future. It contains basic I/O functionality, plotting methods, and methods operating on the cubes.
  • CubeBackgroundModel: class with methods specific to the background cube model production, such as binning, histogramming, and smoothing. It also defines the 3 cubes necessary for the model production:
    • counts cube: contains the statistic participating in the model.
    • livetime cube: contains the livetime correction to apply.
    • background cube: contains the model.
• Command line tool to run the production of background cube models: this tool takes as input a dataset, either from an existing experiment like H.E.S.S. or simulated data produced with the tools mentioned above, and produces the models in several steps:
  • Produce a global observation list, filtering out observations taken close to known sources. The tool selects all observations far from the sources listed in the TeVCat catalog.
  • Group the observations according to similar observation properties, in this case altitude and azimuth angles.
  • For each group, produce the background cube model by:
    • defining the binning.
    • filling events and the livetime correction in cubes.
    • filling events in the background cube.
    • smoothing the background cube to reduce fluctuations due to (low) Poisson statistics.
    • correcting for livetime and bin volume.
    • setting the 0 level to something very small.
• Example script to make a few plots comparing 2 sets of background cube models.
• Added a new file to gammapy-extra with a test CubeBackgroundModel object.

More details on the background cube model production can be found online in the documentation I produced during the GSoC, especially in the last week.

## Example

As an example of the new capabilities of Gammapy to produce background models, I produced 2 different models using the tools I developed and documented here, and compared them:

• A simulated background cube model (a.k.a. true model).
• A reconstructed model (a.k.a. reco model) using simulated data following the model used for the true model.

The models very roughly represent the background seen by a H.E.S.S.-like experiment at an altitude close to the zenith and South azimuth. Using the example script to plot 2 models together for comparison, I produced the following plot (click for an enlarged view):

The plot shows that the reconstructed (reco) model agrees quite well with the simulated (true) model.
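The per-group model production steps can be sketched in a few lines of NumPy (a toy illustration only, not the actual Gammapy implementation: the smoothing kernel, binning, and cube containers are all simplified, and the function name is hypothetical):

```python
import numpy as np

def make_bg_cube(counts, livetime, bin_volume, floor=1e-10):
    """Toy background-cube production: smooth the counts cube to reduce
    Poisson fluctuations, correct for livetime and bin volume, and set
    the zero level to something very small."""
    # Crude box smoothing along each spatial axis (stand-in for the
    # real smoothing method; energy axis 0 is left untouched)
    smoothed = counts.astype(float)
    for axis in (1, 2):
        smoothed = (np.roll(smoothed, 1, axis) + smoothed +
                    np.roll(smoothed, -1, axis)) / 3.0
    bg = smoothed / (livetime * bin_volume)
    bg[bg <= 0] = floor  # avoid exact zeros in the model
    return bg
```

The floor on empty bins matters downstream: a model bin with exactly zero probability would make any event falling there infinitely unlikely.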
The following animated image shows the data of the reco model (click for an enlarged view):

Each frame represents one energy bin of the model, ranging from 0.1 TeV to 80 TeV. The pictures represent the (X, Y) view of the model in detector coordinates (a.k.a. nominal system), covering an 8 x 8 square degree area. This shows that Gammapy can now be used to produce accurate background cube models.

## August 24, 2015

### Rafael Neto Henriques(Dipy)

#### [RNH post #14] Final Project Report

Hi all!

The GSoC coding period is now over! Participating in the GSoC was an amazing experience. In general, all objectives of my project were accomplished. Now, the scientific and wider image processing community has access to the first open source DKI processing modules. As the results of this project showed (see for example my post #10 and post #11), these modules can be used to analyse data from large worldwide collaborative projects such as the Human Connectome Project (HCP). Moreover, I had a great time working with members of my mentoring organization - I learned a lot from them and I will definitely continue contributing to Dipy in the following years. Below you can find my final project report.

## Project summary

In sum, this project was organized in 4 main phases: 1) finishing the work done on functions to simulate signal from the DKI model; 2) implementing methods for estimating the diffusion kurtosis tensor and derived measures; 3) adding a procedure to estimate biophysical parameters from DKI; and 4) developing techniques to estimate fiber directions from real DKI data. The details of the work done in each phase are described below:

DKI based simulations

In this part of the project, I implemented the DKI simulations that were important to test the performance of all functions created in the other steps of the project.
Part of this work was done before the GSoC coding period and its finalization was reported in the mid-term summary. Just to highlight the relevance of these simulations: during the GSoC coding period, 19 nose test functions were created, of which 13 were based on DKI simulations. Moreover, DKI simulations were also useful for selecting, optimizing and debugging DKI methods (see for example post #9 and post #13).

DKI reconstruction modules

As I proposed in my initial project plan, having a final version of the DKI fitting modules and the estimation of diffusion kurtosis statistics was the main goal to achieve for the mid-term evaluation. Since these modules provide the base for the work of the other parts of the project, I decided to dedicate some more time in the second half of the GSoC coding period to improving the diffusion kurtosis statistics functions. These improvements are summarized in the following points:

• The analytical solutions of the mean and radial kurtosis were validated using two numerical methods (post #9).
• The performance of the functions was improved so that all standard kurtosis statistics can be computed within 1 min (post #10).
• I also explored some of Dipy's pre-processing steps that dramatically improved the quality of the DKI reconstructions (post #11 and post #12).
• I added some nose tests to ensure that all code lines of the DKI reconstruction modules were covered by nose test units. From this, I detected some problems with singularities in the function computing the mean kurtosis, which were solved as reported in post #13.
• The sample usage script of these modules was adapted to a new DKI dataset acquired with parameters similar to the HCP.

Below we show the kurtosis statistics images obtained from the HCP-like data using the DKI reconstruction modules before (upper panels of Figure 1) and after (lower panels of Figure 1) the improvements done in the second half of the GSoC term.
Figure 1 - Diffusion kurtosis statistics of the HCP-like data obtained from the implemented DKI reconstructions before (upper panels) and after (lower panels) the optimization done in the second half of the GSoC coding period. The optimized functions seem to correct the artefacts present in white matter regions such as the splenium of the corpus callosum.

The final version of the DKI modules can be found in the following pull request.

DKI based biological measures

Given the extra work done on the previous step, the implementation of the DKI biological measures was rescheduled to the last couple of weeks of the GSoC period. These measures were obtained from the DKI based model proposed by Fieremans et al. (2011), which allows the estimation of concrete biophysical parameters from brain regions of well-aligned fibers. Until the end of the coding period, great advances were made on this module. For example, Figure 2 shows the estimated values of the axonal water fraction (the proportion of water present inside the fibers) for voxels containing well-aligned fibers of the splenium and genu of the corpus callosum, obtained from the current version of this DKI biophysical model.

Figure 2 - Axonal water fraction values of the splenium and genu of the corpus callosum (red-yellow colormap values) plotted over the first b-value=0 image of the HCP-like diffusion-weighted dataset.

Unfortunately, since the final version of these functions depends on other pull requests that are currently being revised, the work on the implementation of the biophysical models was not finalized, and thus it will not be submitted as part of the GSoC code sample. However, I intend to finalize these codes soon after the GSoC.
If you are interested in looking at the final version of the biophysical metric estimations, stay tuned to the updates at the DKI reconstructions pull request.

DKI based fiber direction estimation methods

As planned in the project proposal, in the second half of the GSoC coding period I developed a procedure to predict fiber direction estimates from DKI. This was done by first estimating an orientation distribution function (ODF), which gives the probability that a fiber direction is aligned with a specific spatial direction (post #9). From the ODF, fiber directions can be estimated by finding the maxima of the ODF (post #10). In the last couple of weeks, I accomplished a final version of this procedure by writing its sample usage script, in which ODFs and fiber directions are estimated from real brain data. Visualizations of these estimates are shown in Figures 3 and 4.

Figure 3 - DKI based orientation distribution function (ODF) computed for voxels of a portion of the HCP-like data.

Figure 4 - Fiber directions computed by detecting the directions of ODF maxima. The multiple direction estimates in some voxels show that DKI is able to resolve crossing fibers.

The final version of the modules containing the function to estimate fiber directions from DKI can be found in the following pull request.

## Skills gained on GSoC

• With the supervision of the members of my mentoring organization, I dramatically improved my programming skills.
• I learned all the steps required to work on collaborative projects such as Dipy. In particular, I learned how to share, update and comment on my work using GitHub's development framework.
• I learned how to use the IPython notebook to create sample scripts, and how to use IPython profiling techniques to check and improve function performance.
• Now I know how to use testing units, such as nose test units, which allow me to automatically check for bugs in the functions that I am implementing.
• I also learned how to improve functions using Cython.
• Finally, I got familiar with Dipy's structure and how to use its functions. This is useful knowledge for my own future research.

### Wei Xue(Scikit-learn)

#### GSoC Final Project Report

GSoC is approaching its end. I am very glad to have had such a great experience this summer. I explored the classical machine learning models: Gaussian mixture models (GM), Bayesian Gaussian mixture models with variational inference (BGM), and Dirichlet process Gaussian mixtures (DPGM). The code and doc are in PR4802. Besides these issues, I did some animations and IPython notebooks for these three models. In conclusion, I finished the tasks in the proposal, but I didn't have time to do the optional tasks, i.e., the incremental EM algorithm and different covariance estimators. Anyway, after GSoC, I will continue to contribute to the scikit-learn project.

### Siddharth Bhat(VisPy)

#### GSoC VisPy report 6

This will be the final update for VisPy while I’m officially under GSoC. There’s still some work to be done, so there are still going to be updates.

## Vispy.Plot - merged

The changes to plotting that I’d worked on have been merged successfully! That’s a huge chunk of my GSoC work that’s been integrated into master. Here’s the merged pull request. There are a few things left dangling (mostly code improvements that need to happen), all of which is documented on this issue.

## Grid System

Like I wrote last time, the Grid system is taking shape. However, there are still a few kinks to be worked out, which caused me to cross over the deadline :) Here is the pull request.

## Viridis

A small change pulling the Viridis colormap from matplotlib was added to VisPy. The pull request exposed a pretty interesting design flaw in VisPy - the shader code for the colormaps doesn’t actually render to a texture - it creates conditional branches, which causes a sharp performance decrease as the number of control points increases.
I’ll try and fix this once GSoC ends, since it seems very focused and doable.

## Odds and Ends

GSoC was really fun and interesting as an experience! I wish I’d set goals slightly more realistically, to account for unexpected events and lost time. However, I think I did okay in that regard, and I suppose you live and learn :) I sure did learn a lot, and I’d love to take this further. I’m most definitely still going to contribute to VisPy - seeing the library succeed will be awesome. I have a couple of long-term goals with it, including porting core parts of the library to C/C++ for better performance. However, that’s all very up-in-the-air at the moment. I’ll blog about that as things get more settled.

Until then, Adiós!

### Jakob de Maeyer(ScrapingHub)

#### The party's over :(

Alright, that’s it. The “firm pencils down” date came and went last Friday. As a result, my two major pull requests, the per-key priorities for Scrapy’s dictionary-like settings and the add-on framework, are ready to undergo full review, and hopefully ready to be merged after a few small fixups if needed.

In my proposal, I had set a command line interface for managing add-ons as a stretch goal. However, during the course of the Summer of Code, the implementation and usage of the add-on framework shifted a little from “very user-friendly” to “very correct and unambiguous”, and most of the features I had originally intended for the CLI no longer make much sense. Julia, my mentor at Scrapy, and I therefore decided to set a different stretch goal: allowing spiders to implement the add-on interface as well. Currently, spiders can already implement an update_settings() (class) method, and it only makes sense that they should be able to implement the other add-on callbacks as well. I’m happy to report that, although there are still a few design decisions that need to be made, there is a working implementation in a new PR!
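The per-key priority idea behind the first pull request can be illustrated with a stripped-down, hypothetical settings class (a sketch of the concept, not Scrapy’s actual implementation): a value only replaces an existing one if it was set with at least the priority the stored value was set with.

```python
class PrioritySettings:
    """Minimal sketch of dictionary-like settings with per-key priorities."""

    def __init__(self):
        self._values = {}      # key -> value
        self._priorities = {}  # key -> priority the value was set with

    def set(self, key, value, priority=0):
        # Only override if the new priority is at least the stored one
        if priority >= self._priorities.get(key, float("-inf")):
            self._values[key] = value
            self._priorities[key] = priority

    def get(self, key, default=None):
        return self._values.get(key, default)
```

Tracking a priority per key, instead of one priority for the whole settings object, is what lets a spider or add-on override a single setting without clobbering higher-priority values set elsewhere.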
So, this blog post and the evaluation form I’ll fill out afterwards mark the end of my Summer of Code. BUT, they in no way mark the end of my involvement with Scrapy and the Open Source community! :) The past three months were a blast for me; I learned a lot just from Scrapy’s very well-maintained codebase, and even more from implementing the add-on framework. On top of that, I got to do it in a very welcoming atmosphere! So if you are contemplating applying for Summer of Code, or getting into open source development: Do it, do it, do it! :)

### Siddharth Bhat(VisPy)

#### GSoC VisPy - Update #3

Here’s the next update, slightly delayed because of college.

## SceneGraph is merged

The massive SceneGraph PR that was open all this while has been merged into vispy master, bringing closure to that beast. That’s part 1 of my GSoC project officially complete! All of my other changes reference a single pull request.

## Borders

The colorbar had code to render pixel-width borders that needed to be adapted for the generic case. So, the ColorBarVisual was split again, yielding a BorderVisual that’s used to draw borders in other parts of VisPy.

## Text Positioning

This was the most challenging part of the past two weeks. The ColorBar needed to have text that was placed correctly, independent of the orientation or transforms associated with it. I wrote and re-wrote code for this, but I wasn’t able to hit on the right solution for close to two days. I talked to Eric, my mentor, who suggested simplifying the code to handle the simplest case and build it up from there. That worked beautifully, to my astonishment. Simplifying it let me see the flaw in what I was doing, and incorporating the fix was a really simple job once that was done.

## Colorbar as a Widget

Until now, the colorbar was sitting in the lower levels of VisPy as a ColorBarVisual.
Bringing it up to a higher abstraction level required writing a Widget for the colorbar, which came with its own set of interesting problems. The API is now nice and easy to use, since the colorbar tries hard to position itself automatically. It figures out its position and dimensions. It’s as easy as setting a colormap and having things work, which feels nice to use.

## vispy.plot - ColorBar

The vispy.plot module uses the ColorBarWidget to provide colorbars for plots. As of now, it’s simplistic, since it only places the color bars in different orientations. The next iteration should have a few more features up its sleeve - automatically figuring out the data limits, for one.

## Widget Placement

The plan for next week is to implement the Cassowary constraint algorithm for widget placement. It should be fun, because I don’t know much about linear optimization. The Cassowary algorithm is a derivative of the simplex algorithm, so that looks like a good place to start when it comes to implementation.

## Odds and Ends - Viridis

Viridis is a colormap that was created as an alternative to jet. There’s a PR that implements it in VisPy.

### Vito Gentile(ERAS Project)

#### Enhancement of Kinect integration in V-ERAS: Final report

The “pencils down” date for Google Summer of Code 2015 has passed, and now it’s time to summarize what I have done during these months. It has been a very nice experience, mainly because I have had the opportunity to work on a unique project, ERAS, building on my previous experience with Microsoft Kinect.

I introduced the project in this blog post, so if you want an overview of ERAS and what the Italian Mars Society (IMS) is doing, please read it or go to erasproject.org. The title of my project was “Enhancement of Kinect integration in V-ERAS”, and you can read the full proposal here. In this post I will briefly summarize the goals of my project and how I addressed them.

My project can be divided into four main stages:

1.
rewrite the existing body tracker in order to port it from C# to Python (which is the main language used in ERAS);
2. implement a GUI to manage multiple Kinects connected to a single Windows machine;
3. improve user navigation, by working on the user step estimation algorithm adopted in ERAS;
4. integrate gesture recognition.

## Porting the body tracker from C# to Python

The first step was to reimplement the body tracker in Python. After the V-ERAS-14 mission, where a non-working tracker written in C++ and based on OpenNI was used, the ERAS team decided to adopt the Microsoft API, because it has proved to be more efficient in tracking skeletal joints. However, only the C#, VB.NET and C++ programming languages were supported by Microsoft Kinect SDK v1.x. The only alternative out there for writing a Python-based tracker was an open source project called PyKinect. It is basically a port of the C++ capabilities of the Microsoft API to Python, relying heavily on the ctypes module. PyKinect has proved to be quite reliable and usable, although it is quite poorly documented.

After some tests (some of which are now available as snippets on my Bitbucket account, for anyone who wants to get started with this Python solution for Kinect), I was able to implement a working tracker that recognizes skeletal joints and sends them to a Tango bus. For those who do not know it, Tango is the control system on which the whole ERAS software is based. It allows implementing device servers which can publish various kinds of data on a bus that can be accessed by any other device server or generic software. In Python, the PyTango module makes it very easy to interact with Tango, and it allowed me to implement a Tango device which publishes the skeletal joints recognized by the Kinect. I strongly recommend taking a look at this useful tutorial on how to implement a Tango device server. This part of my source code is mainly in the tracker.py script.
## GUI for managing multiple Kinects

With the previous C# code, I had implemented a GUI for managing multiple Kinects. It allowed changing the Kinect tilt angle and assigning a Kinect to a specific Tango device. However, it did not display the user’s images (depth, color and/or skeletal data), and this feature has proved to be useful, mainly during setup of Motivity/Motigravity and at the beginning of user interactions. After porting everything to Python, the C# GUI was no longer usable, so I had to reimplement it in Python too. I decided to use the pgu library, because it was one of the solutions compatible with pygame, a module used together with PyKinect to implement skeletal tracking. The resulting Python GUI includes the same capabilities as the previous one, with the addition of displaying users’ depth and skeletal data. The following image shows how it looks with just one user:

The code related to the GUI can be found in the gui.py script.

## Improving the user step estimation algorithm

The user step estimation algorithm is one of the parts of the ERAS software that most affects user navigation in the virtual Martian environment. Before GSoC 2015 started, this procedure was implemented as a Blender script, while now the idea is to provide user step data from the body tracker module. During the V-ERAS-14 mission, user navigation was marked as one of the main issues from a user’s perspective. In order to improve this, I tweaked the algorithm a little; it is now implemented in tracker.py (if you are interested, look at the estimate_user_movements function). The user step estimation has also been used by Siddhant Shrivastava, another student who worked on an ERAS-related project during this GSoC.
He produced a very interesting video to demonstrate how user steps can be used to teleoperate a (virtual) rover in ROS:

## Gesture recognition

One of the most interesting and challenging features that I implemented during this GSoC was gesture recognition. I focused on recognizing whether the user's hands are open or closed. PyKinect, like the Microsoft API itself, is not able to recognize this, but an extension made available by Microsoft, called Kinect Interactions, includes some gesture recognition capabilities. What I did was to take the DLL file shipped with the Microsoft Kinect Developer Toolkit 1.8 and convert the C++ methods available in it into something usable from Python. To do this, I wrote some C++ code that simplifies the porting, and relied on ctypes to use the C++ methods in Python code. You can read more about it in this post. So far, the system can recognize whether the hands are open or closed. Kinect Interactions should also allow recognizing a "press gesture", but that seems to be more complicated than expected. The code related to gesture recognition is available here (C++) and here (Python).

## Other contributions

While the main goals of my project were the four described above, I also contributed something not previously scheduled, but useful for future applications. I implemented an algorithm to estimate the user's height, which can be retrieved as the result of a Tango command (get_height). You can see the related code here. I also helped the IMS with some user data collection and analysis, writing some Python scripts used for aggregating data taken from the Oculus Rift and the Microsoft Kinect during training sessions for AMADEE (see this post for more details).

## Conclusions

I hope to have exhaustively described all my contributions to the ERAS project during this GSoC 2015.
In the future I think I will continue collaborating with the Italian Mars Society to improve the Kinect integration a little more (we still need some full integration tests to verify that everything is working correctly). It was a great experience, and I hope to participate in the next GSoC too. But for the moment that's all! Ciao!

### Abraham de Jesus Escalante Avalos(SciPy)

#### Goodbye GSoC

Hello all,

This is my final entry for the GSoC. It's been one hell of a ride, nothing short of life changing. When I started looking for a project, the idea was that even if I didn't get selected to participate, I was going to make my first contribution to open source, which in itself was enough motivation to try. I found several interesting projects and decided to apply to the one that I considered best suited my situation at that moment. Before I got selected I had already interacted with a few members of the community and made a couple of contributions. I was hooked on open source, so there was no looking back. By the time I got selected, the GSoC had already met my expectations. I found a healthy community in SciPy and I could not have asked for a better mentor than Ralf (Gommers). The community members were always involved and supportive, while Ralf provided me with enough guidance to understand new concepts in a simple way (I'm no statistician), but not so much that I would be overwhelmed by the information; I still had room to learn by myself, which is an essential part of my learning process (that's where I find the motivation to do the little things). After starting the GSoC I received the news that I was denied the scholarship to attend Sheffield, and my plans for a master's degree were almost derailed. I then got an offer to study at the University of Toronto, and this is where it got interesting (spoiler alert: I am writing this blog entry from downtown Toronto). I went through the scholarship process again and got selected.
I also went through the process of selecting my courses at UofT. With Ralf's guidance and after some research, I decided to take courses on Machine Learning, Natural Language Processing and other related topics. I can now say with pride that I am the newest member of the SciPy community, which will help me in my journey towards becoming a Machine Learning expert or maybe a Data Scientist; that remains to be seen, but we already have some plans for how I can keep contributing to SciPy and getting acquainted with the pandas and NumPy communities. I'd like to see what comes from there. As you can see, I got a lot more than I had expected from this experience, which I attribute to having approached it with the idea of searching for a passion to turn into a career. Naturally I found it, so now it's time to switch gears.

I would like to use the last paragraph of this rant to give out some thanks. Thanks to Ralf for walking me along to find my own path within the beautiful world of open source and scientific computing. Thanks to the SciPy community, especially to Josef Perktold and Evgeni Burovski, for providing so much valuable feedback on my PRs. Thanks to Google for organising an event like this, giving people like me the excuse they need to finally join open source and stop leaving it for later. And of course, thanks to the people in my life who give me a reason to wake up and try to be a little better than the day before: my girlfriend, Hélène, who keeps my head above the water when I feel like I forgot how to swim by myself, and my parents, whose love and support seem to have no end. You make me feel like I owe it to the world to be the best I can be (or try, at the very least).

## August 23, 2015

### Jaakko Leppäkanga(MNE-Python)

#### Cooling down

Whoops... It looks like I forgot to update this blog a week ago. Here's the update. So I got the interactive TFR merged. That was basically the last thing on my todo list.
I also made some cosmetic fixes to scalings across the plotting functions and fixed a couple of bugs that had been in the code for quite a while, I assume. Now the project is done and I can prepare for moving to Paris, as I was offered an engineer position to continue the work on MNE-Python. Over and out.

### Isuru Fernando(SymPy)

#### GSoC 2015 Week 12 & 13

This week we announced the release of SymEngine on the Sage list. For that, I made some changes to the build system for versioning and for using SymEngine from other C/C++ projects. First, SymEngineConfig.cmake outputs a set of flags, imported dependencies, etc. SymEngineConfigVersion.cmake checks that the version is compatible and that the 32/64-bitness of the SymEngine project matches that of the other CMake project. When SymEngine is only built, these files are at the root level; when installed, they are at /lib/cmake/symengine. An excerpt from the wiki page I wrote at https://github.com/sympy/symengine/wiki/Using-SymEngine-from-a-Cpp-project:

##### Using SymEngine in another CMake project

To use SymEngine from another CMake project, include the following in your CMakeLists.txt file:

find_package(SymEngine 0.1.0 CONFIG)

You can give the path to the SymEngine installation directory, if it was installed to a non-standard location, by:

find_package(SymEngine 0.1.0 CONFIG PATHS /path/to/install/dir/lib/cmake/symengine)

Alternatively, you can give the path to the build directory:

find_package(SymEngine 0.1.0 CONFIG PATHS /path/to/build/dir)

An example project would be:

cmake_minimum_required(VERSION 2.8)
find_package(SymEngine 0.1.0 CONFIG)
set(CMAKE_CXX_FLAGS_RELEASE ${CMAKE_CXX_FLAGS_RELEASE} "-std=c++0x")
include_directories(${SYMENGINE_INCLUDE_DIRS})
add_executable(example main.cpp)
target_link_libraries(example ${SYMENGINE_LIBRARIES})
More options are here
##### Using SymEngine in non-CMake projects
You can get the include flags and link flags needed for SymEngine using CMake on the command line:
compile_flags=`cmake --find-package -DNAME=SymEngine -DCOMPILER_ID=GNU -DLANGUAGE=CXX -DMODE=COMPILE`
link_flags=`cmake --find-package -DNAME=SymEngine -DCOMPILER_ID=GNU -DLANGUAGE=CXX -DMODE=LINK`
g++ $compile_flags main.cpp $link_flags
##### Python wrappers
There was a suggestion to make the Python wrappers separate, so that in a distribution like Gentoo, the package sources can be distributed separately.
So, I worked on the Python wrappers to make them buildable either independently or together with the main repo. Now the Python wrappers directory, along with the setup.py file from the root folder, can be packaged and will work without a problem.

### Palash Ahuja(pgmpy)

#### Project Completed .. :)

I am basically done with my project of building the modules for Dynamic Bayesian Networks, with the ability to do inference over them. I am currently waiting for my PR #465 to be merged.

Google Summer of Code (GSoC) as an experience was challenging and at the same time very conducive to learning, as I have learned things that I might not have learned outside of the program. Sometimes the difficulties were quite a burden, and overcoming them required a certain amount of effort, endurance and patience. This is what GSoC has really taught me.

I am also planning to add the approximate inference module for Dynamic Bayesian Networks after GSoC, since we now have a stable implementation of likelihood weighting, forward and rejection sampling (thanks to Pratyaksh).
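For readers unfamiliar with likelihood weighting, here is a minimal, self-contained sketch of the idea on a toy two-node network. The CPDs and function names are hand-written for illustration only, not pgmpy's actual API:

```python
import random

# Toy network A -> B with hand-written CPDs (illustrative, not pgmpy):
# P(A=1) = 0.3, P(B=1 | A=0) = 0.2, P(B=1 | A=1) = 0.9.
P_A1 = 0.3
P_B1_GIVEN_A = {0: 0.2, 1: 0.9}

def likelihood_weighting(n_samples, seed=0):
    """Estimate P(A=1 | B=1): evidence B=1 is not sampled, it weights."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n_samples):
        a = 1 if rng.random() < P_A1 else 0   # sample the non-evidence node
        w = P_B1_GIVEN_A[a]                   # weight = P(evidence | sample)
        num += w * a
        den += w
    return num / den

# exact answer: 0.3*0.9 / (0.3*0.9 + 0.7*0.2) = 0.27/0.41, about 0.659
print(likelihood_weighting(200000))
```

Unlike rejection sampling, no sample is discarded; each one simply counts with the probability that it would have produced the observed evidence.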

### Pratyaksh Sharma(pgmpy)

#### End of GSoC coding period

The firm 'pencils-down' date was a couple of days ago, Aug 21.

I'm done with the major part of the planned project, with the latest PR #457 yet to be merged.

I'll continue to add tests and documentation for the added modules. Also, the code for Metropolis-Hastings (another MCMC sampling algorithm) is almost ready to be pushed.

Keep watching this space for updates!

### Ziye Fan(Theano)

#### [GSoC 2015 Week 11 & 12]

Finally I found what caused the local_fill_sink optimization PR to fail some test cases.

The failing test case is Test_local_elemwise_alloc. It tests local_elemwise_alloc, which replaces fill nodes with alloc nodes. The test case checks whether the numbers of alloc and assert nodes in the optimized function graph are right.

What caused the failure was the modification of theano.config.experimental.local_alloc_elemwise_assert in T_local_switch_sink in an earlier commit of mine (I was trying to fix T_local_switch_sink's failure at the time and set this option to False to prevent local_alloc_elemwise from creating assert nodes). With the option left at False, local_elemwise_alloc cannot create new assert nodes in Test_local_elemwise_alloc, so the number of assert nodes is wrong and the test fails.

The solution is quite easy: just delete that line and we are safe. But what confused me is how the test results could differ between my own computer and the Travis build server.
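A pattern that prevents this kind of leak is to save and restore the flag in the test fixture, so the override can never escape the test. The sketch below uses a stand-in config object rather than Theano's real theano.config:

```python
import unittest

class _Config:
    """Stand-in for a global config object such as theano.config."""
    local_alloc_elemwise_assert = True

config = _Config()

class TestLocalSwitchSink(unittest.TestCase):
    def setUp(self):
        # remember the flag so the override cannot leak into other tests
        self._saved = config.local_alloc_elemwise_assert
        config.local_alloc_elemwise_assert = False

    def tearDown(self):
        # always restore, even if the test body fails
        config.local_alloc_elemwise_assert = self._saved

    def test_option_is_overridden(self):
        self.assertFalse(config.local_alloc_elemwise_assert)
```

With this structure, Test_local_elemwise_alloc would have seen the default value regardless of the order in which the suites ran.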

The next optimization may be to remove merge3, but it needs to be discussed.

#### [GSoC 2015 Week 9 & 10]

In weeks 9 and 10, I finished making MergeOptimizer able to deal with nodes that have assert inputs.

The Pull Request is here. It can handle the following three cases:

1).

OP( x, y, z )
OP( assert(x, cond_x), y, z )
====
OP( assert(x, cond_x), y, z )

2).

OP( assert(x, cond_x), y, z )
OP( assert(x, cond_y), y, z )
====
OP( assert(x, cond_x, cond_y), y, z )

3).

OP( assert(x, cond_x), y, z )
OP( x, assert(y, cond_y), z )
====
OP( assert(x, cond_x), assert(y, cond_y), z )

Four new test cases were also created.

### Daniil Pakhomov(Scikit-image)

#### Google Summer of Code: Implementing the training part of face detection

Here I will go into the details of implementing the training part of the face detection algorithm and the difficulties that I faced.

## Overall description of the training process.

The training consists of two logical parts:

1. Training the classifier using Gentle Adaboost.
2. Creating the Attentional cascade using trained classifiers.

The Gentle AdaBoost part was implemented with the help of the original MBLBP paper, but with a small difference: the trees are implemented the way OpenCV does it. This was very hard to do because there are very few web resources or papers available online describing how that algorithm works, so I had to read the source code. I will go into details about it in the respective section.

The cascade creation part was implemented using the Viola and Jones paper.

One more problem that I faced during implementation was that I had to write everything in Cython without Python calls, because the training code has to be very efficient; otherwise it would take too long for a user to train a classifier. I had to do everything without using any numpy matrix operations. Also, the fact that I was working with raw arrays made debugging really hard.

I wasn't able to finish the training part by the GSoC deadline and had to keep working a little longer; this extra work didn't count towards GSoC, but I still wanted to finish this part.

Gentle AdaBoost works by training classifiers based on the training set and weights which describe the importance of each training example. In our case we have faces and non-faces. It starts by training a decision tree with equal weights for all examples. After training, some examples are misclassified, so we put more weight on the misclassified examples and less weight on the correctly classified ones, and train another decision tree using the new weights. We repeat this process, and at the end we have a strong classifier that combines the outputs of multiple weak ones (decision trees in our case). This algorithm is used to create the strong classifier for each stage of our cascade.
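The reweighting loop described above can be sketched in a few lines. This is a toy illustration of the boosting idea on 1-D decision stumps, using discrete-AdaBoost-style weight updates for brevity rather than the exact Gentle AdaBoost update, and it is not the actual scikit-image code:

```python
import math

# Toy 1-D training set: feature value x with label +1 (face) / -1 (non-face)
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1, 1, 1, -1, 1, -1, -1, -1]

def stump_error(thresh, w):
    # weighted error of the stump "predict +1 if x < thresh else -1"
    return sum(wi for xi, yi, wi in zip(X, y, w)
               if (1 if xi < thresh else -1) != yi)

def train(n_rounds=3):
    w = [1.0 / len(X)] * len(X)          # start with equal weights
    ensemble = []
    for _ in range(n_rounds):
        # train a weak learner: pick the threshold with least weighted error
        thresh = min((xi + 0.5 for xi in X), key=lambda t: stump_error(t, w))
        err = stump_error(thresh, w)
        alpha = 0.5 * math.log((1 - err) / max(err, 1e-12))
        ensemble.append((alpha, thresh))
        # boost: up-weight misclassified examples, down-weight correct ones
        w = [wi * math.exp(-alpha * yi * (1 if xi < thresh else -1))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    # the strong classifier combines the weak stumps' weighted votes
    score = sum(a * (1 if x < t else -1) for a, t in ensemble)
    return 1 if score >= 0 else -1
```

Each round, the weak learner is forced to pay attention to the examples the previous rounds got wrong; the final strong classifier is the weighted vote of all the stumps.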

The one big problem I faced was that the decision trees used in the original MBLBP paper and the decision trees used by OpenCV are different. The original paper uses a regression decision tree with 256 branches, while OpenCV uses a binary decision tree.

Because we followed the OpenCV API from the very beginning, I had to figure out how to train a binary tree for our case. The problem seems easy, but it's not.

Since we use Multi-block Binary Patterns, the features (the patterns themselves) are categorical variables. Numbers, by contrast, are not categorical: they can easily be ordered and compared to each other, so in the case of numbers, n distinct values give only n - 1 possible split points to check. When we deal with categorical variables like colors (red, blue, black and so on) and Multi-block Binary Patterns, we can't compare or order them, so to find the best split over k values we have to try on the order of 2^(k-1) possible partitions. For example, in the case of colors (red, blue, black, green), one possible split sends red and blue to the right branch and the others to the left one. In the case of Multi-block Binary Patterns we have 256 values, which makes the number of possible splits to check infeasible.

While reading the OpenCV documentation about decision trees, I found that there is a special algorithm for this particular case:

In case of regression and 2-class classification the optimal split can be found efficiently without employing clustering, thus the parameter is not used in these cases.

So, as we can see, OpenCV uses clustering to solve this problem in general, but in our case it doesn't, and the documentation merely mentions that this task can be solved efficiently without clustering.

The only citation that the OpenCV documentation has is:

Breiman, L., Friedman, J. Olshen, R. and Stone, C. (1984), Classification and Regression Trees, Wadsworth.

I wasn't able to find it in full; it was only partially included in some lectures, and the part mentioning the special algorithm wasn't available.

After this I started to read the OpenCV source code related to this problem. I found the function responsible for this algorithm, but it was really hard to understand what was happening there, because the code doesn't contain any citation of the original algorithm or a description of it.

Finally, after spending a lot of time on this, I found a one-sentence description of the algorithm in MATLAB's documentation of its tree module:

The tree can order the categories by mean response (for regression) or class probability for one of the classes (for classification). Then, the optimal split is one of the L – 1 splits for the ordered list. When K = 2, fitctree always uses an exact search.

So the main idea of the algorithm is to sort the values of our categorical variable by their mean response on the training data. This lets us compare them based on these values and treat them like numbers, reducing the complexity from exponential to linear.
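The trick can be sketched as follows. This is a toy illustration with binary 0/1 labels and a deliberately simple split rule (left branch predicts 0, right predicts 1); it is not the OpenCV or scikit-image implementation:

```python
from collections import defaultdict

def best_categorical_split(categories, labels, weights):
    """Best left/right partition of category values for 0/1 labels."""
    stats = defaultdict(lambda: [0.0, 0.0])   # cat -> [sum(w*y), sum(w)]
    for c, yv, w in zip(categories, labels, weights):
        stats[c][0] += w * yv
        stats[c][1] += w
    # order categories by mean response; only L-1 splits need to be checked
    ordered = sorted(stats, key=lambda c: stats[c][0] / stats[c][1])
    best, best_err = None, float('inf')
    for i in range(1, len(ordered)):
        left = set(ordered[:i])
        # toy split rule: left branch predicts 0, right branch predicts 1
        err = sum(w for c, yv, w in zip(categories, labels, weights)
                  if (0 if c in left else 1) != yv)
        if err < best_err:
            best, best_err = left, err
    return best, best_err
```

Instead of enumerating every subset of categories, we only scan the L - 1 prefixes of the mean-response ordering, which is exactly the CART result the MATLAB sentence alludes to.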

It was really strange that I found no description of this algorithm online, only this one sentence, and that by chance. It may be that this particular case is really rare in the real world and nobody teaches it.

One more observation: using the original tree from the paper gives a faster and better training process, but a slightly slower evaluation part. I think that in the future, after more experiments, we can also support training with the original tree from the paper. For now we use a binary decision tree, as in OpenCV.

These are the first 4 most descriptive Multi-block Binary Patterns features that were found by Gentle Adaboost and binary decision trees:

And this is the first feature that was found using the same process but with the original tree from the paper:

As can be seen, the result is better, because it is more similar to the features that were derived in the Viola and Jones paper. In this case the feature says that the eye regions are usually darker than the nose region. This is an example of a weak classifier. The results are like this because a binary tree is a weaker classifier than a tree with 256 branches.

This is why there is still room to improve the classifier by using the trees from the paper.

During the implementation I strictly followed the Viola and Jones paper.

## Results of the work by the end of the deadline of GSOC

By the end of GSoC I was able to implement the efficient evaluation part of face detection: the script uses a trained xml file from OpenCV and is able to detect faces in an image. This was made possible with the help of OpenMP, because some parts can be done in parallel; otherwise the detection takes too long.

The training part was partially implemented; the delay was caused by the absence of information about the efficient algorithm for splitting a binary tree on categorical variables.

## August 22, 2015

### Siddhant Shrivastava(ERAS Project)

#### Telerobotics - Final Report

Hi all! Yesterday was the firm 'pencils-down' deadline for the coding period, and the past week was one of the best weeks of the Google Summer of Code 2015 program. I went all guns blazing with the documentation and Virtual Machine distribution efforts for my work on Telerobotics. I also added some significant features to Telerobotics, such as ROS integration with the EUROPA scheduler that Shridhar worked on this summer with the Italian Mars Society.

# Project Report

I completed the main aspects of the Telerobotics interface with strong results -

• Introduced the Robot Operating System (ROS) to ERAS
• Developed a Telerobotics Interface to Bodytracking and EUROPA
• Implemented Stereoscopic Streaming of 3-D video to the Blender Game Engine V-ERAS application

I explain each of these points and summarize my experience in the following paragraphs. In the last week, I got a chance to pursue a collective effort in all the areas of my project -

## Replication Experiments

The final week started with attempts to ensure that my mentors could replicate my machine setup in order to test and comment on the performance of Telerobotics. To that end, I added detailed instructions describing my machine and network configuration, which can be found here.

## Docker Working!

I explained the importance of Docker for this project in a previous post. Franco started the ball rolling by telling me how the ssh-to-image method could be used for running Qt applications in Docker. ROS and Gazebo employ Qt extensively for their visualization and simulation applications, so this was a non-functional requirement of Telerobotics. With that, the long-standing Docker issue was solved. The final Docker image with everything packaged can be used to test Telerobotics; the image can be pulled from here. The instructions for using the image are in the Telerobotics documentation pages.

A walkthrough with the Docker image can be found in this YouTube video that I created -

## Fallback Keyboard Teleoperation

Telerobotics works out of the box with the Bodytracking module that Vito has developed. But in the unfortunate case where the Tango-Control server fails, there is a functional requirement for a fallback interface. Taking inspiration from the teleoperation tools for ROS, I added a fallback keyboard teleoperation interface, so the rover can now also be controlled with the keyboard if need be. The controls are currently geared towards right-handed astronauts; I hope to add a left-handed version soon as a minor extension of the interface. The code for this can be found here.
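To give a flavor of what such a fallback looks like, here is a sketch with hypothetical key bindings (illustrative only, not the actual Telerobotics code), in the spirit of ROS's teleop_twist_keyboard: each key maps to a (linear, angular) velocity pair that would then be published as a geometry_msgs/Twist message.

```python
# Hypothetical key bindings (not the real Telerobotics layout):
# each key maps to a (linear, angular) velocity direction.
BINDINGS = {
    'i': ( 1.0,  0.0),   # forward
    ',': (-1.0,  0.0),   # backward
    'j': ( 0.0,  1.0),   # rotate left
    'l': ( 0.0, -1.0),   # rotate right
    'k': ( 0.0,  0.0),   # stop
}

def key_to_twist(key, linear_scale=0.5, angular_scale=1.0):
    """Translate a key press into scaled twist components, or None."""
    if key not in BINDINGS:
        return None
    lin, ang = BINDINGS[key]
    return (lin * linear_scale, ang * angular_scale)
```

In a real ROS node, the returned pair would fill in twist.linear.x and twist.angular.z before publishing on the rover's command topic.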

Shridhar's work on the EUROPA platform needed access to the Telerobotics interface for the following tasks -

• Getting Robot Diagnostic Information
• Navigating the Robot to certain points

I achieved the initial goal before midsems. The second goal was achieved this week after the EUROPA Planner was complete. The workflow to this end was to receive coordinates from the EUROPA Tango Server and send them to the ROS Node corresponding to the Husky.

Finding the optimal path between two points on an incompletely-known map is solved by using Augmented Monte Carlo Localization.
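To give a flavor of the Monte Carlo Localization family of algorithms that AMCL belongs to, here is a minimal 1-D particle filter sketch (a toy illustration with made-up numbers, not AMCL itself and not the ROS implementation): particles are moved with the commanded motion plus noise, weighted by a range measurement to a known landmark, and resampled.

```python
import math
import random

# Toy 1-D localization: a rover drives along a corridor toward a landmark
# at a known position. All names and numbers here are illustrative.
rng = random.Random(42)
TRUE_START, LANDMARK = 2.0, 10.0

def measurement_prob(particle, measured, sigma=0.5):
    # likelihood of the range reading given a hypothesized position
    d = (LANDMARK - particle) - measured
    return math.exp(-d * d / (2 * sigma ** 2))

# start with particles spread uniformly, i.e. "position unknown"
particles = [rng.uniform(0.0, 10.0) for _ in range(1000)]
for step in range(20):
    # motion update: the rover commands +0.2 m, with actuation noise
    particles = [p + 0.2 + rng.gauss(0, 0.05) for p in particles]
    true_pos = TRUE_START + 0.2 * (step + 1)
    z = LANDMARK - true_pos + rng.gauss(0, 0.1)   # noisy range to landmark
    weights = [measurement_prob(p, z) for p in particles]
    # importance resampling: keep particles in proportion to their weight
    particles = rng.choices(particles, weights=weights, k=len(particles))

estimate = sum(particles) / len(particles)
print(estimate)   # converges near the true final position of 6.0
```

The "augmented" variant used by ROS additionally injects random particles when the average weight drops, so the filter can recover from being kidnapped or badly initialized.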

It is necessary to localize the rover with respect to its environment based on the inputs of its multiple sensors. The following diagram from the ROS website explains the concept -

I used the Husky frame coordinates and added the code using the ROS Action Server and Action Client and Tango Event Listeners to create the appropriate Telerobotics-EUROPA interfaces. It can be found here.

## Minoru Camera Tools

The Minoru 3-D camera that I used to prototype streaming applications for ERAS has obscure documentation for Linux platforms. I was able to set up the Minoru calibration tools from a Git clone of the original v4l2stereo package. I added them to the streams tree of the Telerobotics source code. It can be accessed here.

## Documentation!

The documentation underwent a major overhaul this week. In addition to commenting the code since the beginning, I made sure to update/add the following documentation pages -

The excitement of the final moments can be ascertained from my commit patterns on the last day -

## Learning Experience

The past 12 weeks (and an almost equivalent time before that during application period) have been transformative.

Just to get an idea of the different tools and concepts that I've been exposed to, here's a list -

• Tango Controls
• Robot Operating System
• Blender Game Engine
• Oculus Rift
• FFmpeg
• Stereoscopic Cameras
• Video4Linux2
• Python
• OpenVPN
• Docker

That indicates a great deal of experience in terms of tools alone.

I learned how to create software architecture documents, how to work in tandem with other developers, how to communicate in the Open Source Community, when to seek help, how to seek help, how to help others, how to document my work, how to blog, and much more.

With so many things to say, here's what I must definitely acknowledge -

Thank you Python Software Foundation, Italian Mars Society, and Google Open Source Programs Office for this opportunity!

I seriously can't imagine a better way in which I could have spent the past summer. I got a chance to pursue what I wanted to do, got an amazing mentoring and umbrella organization, a fascinating group of peers to work with, and arguably the best launchpad for Open Source contributions - the Google Summer of Code.

Time for evaluations now! Fingers crossed :-)

I have maintained a weekly-updated blog since the beginning of this Summer of Code; my organization required a blog frequency of one post every two weeks. I loved blogging about my progress throughout. The eighteen posts so far can be found in the GSoC category of my website. In case you are interested in this project with the Italian Mars Society, you can follow the page of my blog.

Ciao!

## August 21, 2015

#### Final commit

Phew! Most of the work was done last week. This week was spent mostly fixing the bugs discovered by the mentors. Finally done with the UI of the Wiki editor for both themes.

I also separated the third-party plugins from the project by creating XStatic Python modules: MarkitUP https://pypi.python.org/pypi?%3Aaction=pkg_edit&name=XStatic-BootstrapTagsInput

Here are the screenshots: http://i.imgur.com/9Un3FZm.png http://i.imgur.com/tzJ8Uy2.png

### Udara Piumal De Silva(MyHDL)

Since my original test had only 1454 data values, I wrote a script that generates a hex file with a larger number of random values and compares it with the results.

The Python script to generate the hex file is as follows:

from intelhex import IntelHex
from random import randrange

ih = IntelHex()
# fill addresses 0..99,999 with random byte values
for i in range(0, 100000):
    ih[i] = randrange(0, 255)

f = open('test.hex', 'w')
ih.write_hex_file(f)
f.close()

Then I followed the same procedure to write to and read from the SDRAM. The upper address was 99,999 and the lower address 0, and all test cases passed:
Tests Results = Passed: 100000 Failed: 0

I have created a gist with all the files I used in the procedure
https://gist.github.com/udara28/3ee43af50971a65feb71

### Sumith(SymPy)

#### GSoC - Wrapping Up

Going from not knowing anything considerable about programming and open source to reaching this level has been a wonderful ride. Google Summer of Code has been full of ups and downs, but nonetheless exhilarating.

At the time of my first patch, I didn't even know that I would be so closely associated with SymEngine and the team members just a few months down the line.

After a couple of bug fixes, my first major contribution was the UnivariatePolynomial class. The biggest challenge here was implementing multiplication using Kronecker's trick; this was my first experience of implementing an algorithm from a paper. The UnivariatePolynomial class shaped up really well. There are minor improvements and some optimizations that could still be made, but standalone it is a fully functional class.
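For the curious, Kronecker's trick reduces polynomial multiplication to a single big-integer multiplication by evaluating both polynomials at a sufficiently large power of two. A minimal sketch (non-negative integer coefficients only; not the SymEngine implementation):

```python
def kronecker_mul(a, b):
    """Multiply polynomials given as coefficient lists, lowest degree first.
    Assumes non-negative integer coefficients, not all zero."""
    # choose a base so large that no convolution coefficient can overflow
    max_coef = max(a) * max(b) * min(len(a), len(b))
    shift = max_coef.bit_length()          # 2**shift > max_coef
    base = 1 << shift
    # evaluate both polynomials at x = base: packs all coefficients
    # into one big integer, shift bits per coefficient
    A = sum(c << (i * shift) for i, c in enumerate(a))
    B = sum(c << (i * shift) for i, c in enumerate(b))
    C = A * B   # one big-integer multiply performs the whole convolution
    # unpack the product's coefficients back out of the big integer
    out = []
    for _ in range(len(a) + len(b) - 1):
        out.append(C & (base - 1))
        C >>= shift
    return out
```

Because big-integer multiplication in GMP-style libraries is subquadratic, the polynomial product inherits that speed for free; handling signed or large coefficients just needs a more careful choice of base.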

Once this was done, my next aim was to optimize multiplication to reach Piranha's speed. This was a very enriching period, and the discussions with the team members and Francesco were a great learning experience. En route, I also got a chance to explore Piranha under the hood and to trouble Francesco for the reasoning behind why certain things were the way they were. At the end of this, we were able to hit Piranha's speed. I remember being the happiest I had been in days.

Once we hit the lower-level speed, we decided to hard-depend on Piranha for Polynomial. This meant adding Piranha as a SymEngine dependency. Here I had to learn how to write CMake files, and setting up Piranha testing in Travis meant writing shell and CI scripts. We faced a problem here whose resolution meant adopting Catch as the testing framework for SymEngine. Catch is an awesome library and its community is very pleasant; implementing this was fun work too. Also, the high-level value class Expression was implemented in SymEngine, mostly taken from Francesco's work.

I then started writing the Polynomial class; most of the work is done here (597). But the design is not very well thought out. I say this because, once ready, it can only support the integer (ZZ) domain, while we will also need the rational (QQ) and expression (EX) domains. The code will still be of much use, but we have been discussing a much cleaner implementation based on a Ring class. Most of the progress and the new design decisions are being documented here.

The second half has been really rough, with the university term running. Ondrej has been really patient with me, and I thank him for that. The bond that I made with him through mails, technical and non-technical, has really grown strong. He has allowed me to continue the work on Polynomial and implement more details and algorithms in the future. I am looking forward to that, as a long-term association is an amazing thing, and I am proud to be responsible for the Polynomial module in SymEngine.

I am indebted to my mentor Ondrej Certik and all the SymEngine and SymPy developers, who were ever ready to help and answer my silliest questions. It's an amazing community; they are really very helpful and always appreciated even the smallest of my contributions. The best part of SymEngine is that you know the contributors one to one, and it is like a huge family of learners. I am looking forward to meeting the team (at least SymPy India) in the near future.

Google Summer of Code has been one exhilarating journey. I don't know if I was a good programmer then or a good programmer now but I can say that I am a better programmer now.

This is just the beginning of the ride; GSoC is a stepping stone.

There will be blog posts coming here, so stay tuned. Till then,
Bye

### Shivam Vats(SymPy)

#### And GSoC ends!

This was the last week of my Google Summer of Code project on fast series expansions for SymPy and SymEngine. It has thoroughly been an amazing experience, challenging and rewarding in more ways than I had imagined. I was extremely lucky to have such awesome mentors as Ondrej and Thilina.

Though I couldn't achieve all that I had planned in my proposal, it taught me what I think is my biggest takeaway from the experience: things seldom work the way you want them to. In fact, I faced the most difficulty in the part of my project that I had assumed to be trivial: ring series in SymPy. And as it turned out, it was the cornerstone of what I had set out to do, and it needed to be done well.

Fortunately, things turned out rather well there, and now most of the difficult questions with regard to ring series have been answered. ring series now has a general-purpose rs_series function that expands an arbitrary SymPy expression really fast. Most of the important internal functions are also implemented now. I think that, as a module, ring series has reached a stage where it can be helpful to people, and others can help with improving and expanding it. Of course, a crazy amount of work still needs to be done, and for that we need a lot of helping hands.
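Assuming a SymPy version that ships the ring_series module, usage looks roughly like this (the exact API may differ between versions):

```python
from sympy import Rational, exp, simplify, sin, symbols
from sympy.polys.ring_series import rs_series

x = symbols('x')
# expand an arbitrary expression up to (but not including) order x**6;
# the result is a fast polynomial ring element rather than an Expr
s = rs_series(sin(x) * exp(x), x, 6)

# the analytic expansion of sin(x)*exp(x) for comparison
expected = x + x**2 + Rational(1, 3)*x**3 - Rational(1, 30)*x**5
print(s.as_expr())
```

Working in a polynomial ring is what gives the speedup over the generic series machinery: coefficients are plain domain elements instead of full symbolic expressions.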

I have been writing a guide as well as documenting the internals in PR 9839. The importance of good documentation is another lesson I learnt during my project.

The most important thing is that people use these new capabilities, and I hope more people will get involved. If all goes well, it is a potential replacement for the current series method of SymPy.

Other than that, I had a very fruitful discussion with Ondrej about how to implement polynomials and then series expansion in SymEngine. You can read the summary here. I am already excited about writing all this new stuff.

The end of GSoC is not really an end; it is a beginning, of more interesting times :)

Cheers!!

And GSoC ends! was originally published by Shivam Vats at Me on August 21, 2015.

## August 20, 2015

### Prakhar Joshi(Plone)

#### Tickling with tests

Hello everyone! So we are finally at the end of the project, and I really enjoyed every part of it. After unit testing the transform, it's time to write functional tests and integration tests, or as we can call them, browser tests for our add-on.

I have written functional tests to ensure that the new add-on is imported, all the profiles are installed, and the editor uses our new transform rather than the old one. We had already implemented registering the new add-on to replace the old one, and we also have to make tinyMCE use our new transform.

How do we make TinyMCE use our new transform in place of the old one?
To understand that, we should know how TinyMCE calls the transform script. It uses getToolByName on portal_transforms to get the required transform: TinyMCE searches for safe_html in portal_transforms, and the old transform was named safe_html.

So here we had two ways to proceed :-
1) Either change the TinyMCE configuration so that it calls for exp_safe_html instead of safe_html
2) Or simply give our transform the same name, safe_html, instead of some other name

We chose the second. Consider the first option: suppose a user deregisters our add-on and wants to filter HTML with the old transform. If we had changed the TinyMCE configuration to call for exp_safe_html, that lookup would now fail, and TinyMCE would be unable to create a new page.

So finally we gave our transform the same name as the old one, so that TinyMCE sees nothing out of the ordinary. :P
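
The naming trade-off can be sketched as a plain name-based registry lookup (all names here are hypothetical; this is a simplified model, not the actual Plone/TinyMCE API):

```python
# Transforms are looked up by name; the editor always asks for 'safe_html'.
registry = {}

def register_transform(name, func):
    registry[name] = func

def old_safe_html(html):
    # stand-in for the old filtering transform
    return html.replace('<script>', '').replace('</script>', '')

register_transform('safe_html', old_safe_html)

# Option 1 would register under 'exp_safe_html' and require reconfiguring
# the editor -- which breaks if the add-on is later deregistered.
# Option 2 re-registers under the SAME name, so the editor's lookup
# never has to change:
def exp_safe_html(html):
    # stand-in for the new, experimental filtering transform
    return html.replace('<script>', '').replace('</script>', '')

register_transform('safe_html', exp_safe_html)

transform = registry['safe_html']   # the editor's lookup, unchanged
print(transform('<p>hi</p><script>x</script>'))
```

Because the lookup key never changes, deregistering the add-on and restoring the old transform is equally transparent to the editor.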

After that I added my new add-on to a fresh Plone instance, tried to install it in my Plone site, and it worked!!

Yayayay!!! Things are going well. I also imported the new add-on from portal_transforms and checked the terminal, which shows that all the required things are being imported; I then created a new page, and the filtering works perfectly.

Here is a screenshot of the terminal.

It's always great to get good feedback from your mentor :P

Hope you like it.

Cheers,

### Vivek Jain(pgmpy)

#### Final Notebook

pgmpy is a Python library for the creation, manipulation and implementation of probabilistic graphical models. There are various standard file formats for representing PGM data. PGM data basically consists of a graph, a table corresponding to each node, and a few other attributes of the graph.
pgmpy can read networks from and write networks to these standard file formats. Currently pgmpy supports 5 file formats: ProbModelXML, PomDPX, XMLBIF, XMLBeliefNetwork and UAI. Using these modules, models can be specified in a uniform file format and readily converted to Bayesian or Markov model objects.
Now, let's read a ProbModelXML file and get the corresponding model instance.
In [1]:
from pgmpy.readwrite import ProbModelXMLReader
In [2]:
reader_string = ProbModelXMLReader('example.pgmx')
Now, to get the corresponding model instance, we use get_model()
In [3]:
model = reader_string.get_model()
Now we can query this model according to our requirements. It is an instance of BayesianModel or MarkovModel, depending on the type of the given model.
Suppose we want to know all the nodes in the given model, we can use
In [4]:
print(model.nodes())
['Smoker', 'X-ray', 'VisitToAsia', 'Tuberculosis', 'TuberculosisOrCancer', 'LungCancer', 'Dyspnea', 'Bronchitis']
To get all the edges use model.edges() method.
In [5]:
model.edges()
Out[5]:
[('Smoker', 'LungCancer'), ('Smoker', 'Bronchitis'), ('VisitToAsia', 'Tuberculosis'), ('Tuberculosis', 'TuberculosisOrCancer'), ('TuberculosisOrCancer', 'Dyspnea'), ('TuberculosisOrCancer', 'X-ray'), ('LungCancer', 'TuberculosisOrCancer'), ('Bronchitis', 'Dyspnea')]
To get all the cpds of the given model we can use model.get_cpds() and to get the corresponding values we can iterate over each cpd and call the corresponding get_cpd() method.
In [6]:
cpds = model.get_cpds()
for cpd in cpds:
    print(cpd.get_cpd())
[[ 0.95  0.05]
 [ 0.02  0.98]]
[[ 0.7  0.3]
 [ 0.4  0.6]]
[[ 0.9  0.1  0.3  0.7]
 [ 0.2  0.8  0.1  0.9]]
[[ 0.99]
 [ 0.01]]
[[ 0.5]
 [ 0.5]]
[[ 0.99  0.01]
 [ 0.9   0.1 ]]
[[ 0.99  0.01]
 [ 0.95  0.05]]
[[ 1.  0.  0.  1.]
 [ 0.  1.  0.  1.]]
pgmpy not only allows us to read from a specific file format but also helps us write a given model into a specific file format. Let's write a sample model to a ProbModelXML file.
For that, first define the data for the model.
In [7]:
import numpy as np

edges_list = [('VisitToAsia', 'Tuberculosis'),
              ('LungCancer', 'TuberculosisOrCancer'),
              ('Smoker', 'LungCancer'),
              ('Smoker', 'Bronchitis'),
              ('Tuberculosis', 'TuberculosisOrCancer'),
              ('Bronchitis', 'Dyspnea'),
              ('TuberculosisOrCancer', 'Dyspnea'),
              ('TuberculosisOrCancer', 'X-ray')]

nodes = {'Smoker': {'States': {'no': {}, 'yes': {}},
                    'role': 'chance',
                    'type': 'finiteStates',
                    'Coordinates': {'y': '52', 'x': '568'},
                    'AdditionalProperties': {'Title': 'S', 'Relevance': '7.0'}},
         'Bronchitis': {'States': {'no': {}, 'yes': {}},
                        'role': 'chance',
                        'type': 'finiteStates',
                        'Coordinates': {'y': '181', 'x': '698'},
                        'AdditionalProperties': {'Title': 'B', 'Relevance': '7.0'}},
         'VisitToAsia': {'States': {'no': {}, 'yes': {}},
                         'role': 'chance',
                         'type': 'finiteStates',
                         'Coordinates': {'y': '58', 'x': '290'},
                         'AdditionalProperties': {'Title': 'A', 'Relevance': '7.0'}},
         'Tuberculosis': {'States': {'no': {}, 'yes': {}},
                          'role': 'chance',
                          'type': 'finiteStates',
                          'Coordinates': {'y': '150', 'x': '201'},
                          'AdditionalProperties': {'Title': 'T', 'Relevance': '7.0'}},
         'X-ray': {'States': {'no': {}, 'yes': {}},
                   'role': 'chance',
                   'AdditionalProperties': {'Title': 'X', 'Relevance': '7.0'},
                   'Coordinates': {'y': '322', 'x': '252'},
                   'Comment': 'Indica si el test de rayos X ha sido positivo',
                   'type': 'finiteStates'},
         'Dyspnea': {'States': {'no': {}, 'yes': {}},
                     'role': 'chance',
                     'type': 'finiteStates',
                     'Coordinates': {'y': '321', 'x': '533'},
                     'AdditionalProperties': {'Title': 'D', 'Relevance': '7.0'}},
         'TuberculosisOrCancer': {'States': {'no': {}, 'yes': {}},
                                  'role': 'chance',
                                  'type': 'finiteStates',
                                  'Coordinates': {'y': '238', 'x': '336'},
                                  'AdditionalProperties': {'Title': 'E', 'Relevance': '7.0'}},
         'LungCancer': {'States': {'no': {}, 'yes': {}},
                        'role': 'chance',
                        'type': 'finiteStates',
                        'Coordinates': {'y': '152', 'x': '421'},
                        'AdditionalProperties': {'Title': 'L', 'Relevance': '7.0'}}}

edges = {'LungCancer': {'TuberculosisOrCancer': {'directed': 'true'}},
         'Smoker': {'LungCancer': {'directed': 'true'},
                    'Bronchitis': {'directed': 'true'}},
         'Dyspnea': {},
         'X-ray': {},
         'VisitToAsia': {'Tuberculosis': {'directed': 'true'}},
         'TuberculosisOrCancer': {'X-ray': {'directed': 'true'},
                                  'Dyspnea': {'directed': 'true'}},
         'Bronchitis': {'Dyspnea': {'directed': 'true'}},
         'Tuberculosis': {'TuberculosisOrCancer': {'directed': 'true'}}}

cpds = [{'Values': np.array([[0.95, 0.05], [0.02, 0.98]]),
         'Variables': {'X-ray': ['TuberculosisOrCancer']}},
        {'Values': np.array([[0.7, 0.3], [0.4, 0.6]]),
         'Variables': {'Bronchitis': ['Smoker']}},
        {'Values': np.array([[0.9, 0.1, 0.3, 0.7], [0.2, 0.8, 0.1, 0.9]]),
         'Variables': {'Dyspnea': ['TuberculosisOrCancer', 'Bronchitis']}},
        {'Values': np.array([[0.99], [0.01]]),
         'Variables': {'VisitToAsia': []}},
        {'Values': np.array([[0.5], [0.5]]),
         'Variables': {'Smoker': []}},
        {'Values': np.array([[0.99, 0.01], [0.9, 0.1]]),
         'Variables': {'LungCancer': ['Smoker']}},
        {'Values': np.array([[0.99, 0.01], [0.95, 0.05]]),
         'Variables': {'Tuberculosis': ['VisitToAsia']}},
        {'Values': np.array([[1, 0, 0, 1], [0, 1, 0, 1]]),
         'Variables': {'TuberculosisOrCancer': ['LungCancer', 'Tuberculosis']}}]
Now let's create a model from the given data.
In [8]:
from pgmpy.models import BayesianModel
from pgmpy.factors import TabularCPD

model = BayesianModel(edges_list)
for node in nodes:
    model.node[node] = nodes[node]
for edge in edges:
    model.edge[edge] = edges[edge]

tabular_cpds = []
for cpd in cpds:
    var = list(cpd['Variables'].keys())[0]
    evidence = cpd['Variables'][var]
    values = cpd['Values']
    states = len(nodes[var]['States'])
    evidence_card = [len(nodes[evidence_var]['States'])
                     for evidence_var in evidence]
    tabular_cpds.append(
        TabularCPD(var, states, values, evidence, evidence_card))
model.add_cpds(*tabular_cpds)
In [9]:
from pgmpy.readwrite import ProbModelXMLWriter, get_probmodel_data
To get the data that we need to pass to ProbModelXMLWriter, we use the method get_probmodel_data. This method is specific to ProbModelXML; for other file formats we pass the model directly to the given Writer class.
In [10]:
model_data = get_probmodel_data(model)
writer = ProbModelXMLWriter(model_data=model_data)
print(writer)

To write the XML data to a file, we can use the write_file method of the given Writer class.
In [ ]:
writer.write_file('probmodelxml.pgmx')

## General workflow of the readwrite module

pgmpy.readwrite.[fileformat]Reader is the base class for reading the given file format. Replace [fileformat] with the desired file format from which you want to read. This base class defines various methods for parsing the given file. For example, for XMLBeliefNetwork the defined methods are as follows.
In [4]:
from pgmpy.readwrite.XMLBeliefNetwork import XBNReader
reader = XBNReader('xmlbelief.xml')
get_analysisnotebook_values: It returns a dictionary of the attributes of analysisnotebook tag.
In [5]:
reader.get_analysisnotebook_values()
Out[5]:
{'NAME': 'Notebook.Cancer Example From Neapolitan', 'ROOT': 'Cancer'}
get_bnmodel_name: It returns the name of the bnmodel.
In [6]:
reader.get_bnmodel_name()
Out[6]:
'Cancer'
get_static_properties: It returns the dictionary of staticproperties.
In [7]:
reader.get_static_properties()
Out[7]:
{'CREATOR': 'Microsoft Research DTAS', 'FORMAT': 'MSR DTAS XML', 'VERSION': '0.2'}
get_variables: It returns the list of variables.
In [8]:
reader.get_variables()
Out[8]:
{'a': {'DESCRIPTION': '(a) Metastatic Cancer', 'STATES': ['Present', 'Absent'], 'TYPE': 'discrete', 'XPOS': '13495', 'YPOS': '10465'},
 'b': {'DESCRIPTION': '(b) Serum Calcium Increase', 'STATES': ['Present', 'Absent'], 'TYPE': 'discrete', 'XPOS': '11290', 'YPOS': '11965'},
 'c': {'DESCRIPTION': '(c) Brain Tumor', 'STATES': ['Present', 'Absent'], 'TYPE': 'discrete', 'XPOS': '15250', 'YPOS': '11935'},
 'd': {'DESCRIPTION': '(d) Coma', 'STATES': ['Present', 'Absent'], 'TYPE': 'discrete', 'XPOS': '13960', 'YPOS': '12985'},
 'e': {'DESCRIPTION': '(e) Papilledema', 'STATES': ['Present', 'Absent'], 'TYPE': 'discrete', 'XPOS': '17305', 'YPOS': '13240'}}
get_edges: It returns a list of tuples. Each tuple contains two elements (parent, child) for each edge.
In [9]:
reader.get_edges()
Out[9]:
[('a', 'b'), ('a', 'c'), ('b', 'd'), ('c', 'd'), ('c', 'e')]
get_distributions: It returns a dictionary mapping each variable name to its distribution.
In [10]:
reader.get_distributions()
Out[10]:
{'a': {'DPIS': array([[ 0.2,  0.8]]), 'TYPE': 'discrete'},
 'b': {'CARDINALITY': array([2]), 'CONDSET': ['a'], 'DPIS': array([[ 0.8,  0.2], [ 0.2,  0.8]]), 'TYPE': 'discrete'},
 'c': {'CARDINALITY': array([2]), 'CONDSET': ['a'], 'DPIS': array([[ 0.2 ,  0.8 ], [ 0.05,  0.95]]), 'TYPE': 'discrete'},
 'd': {'CARDINALITY': array([2, 2]), 'CONDSET': ['b', 'c'], 'DPIS': array([[ 0.8 ,  0.2 ], [ 0.9 ,  0.1 ], [ 0.7 ,  0.3 ], [ 0.05,  0.95]]), 'TYPE': 'discrete'},
 'e': {'CARDINALITY': array([2]), 'CONDSET': ['c'], 'DPIS': array([[ 0.8,  0.2], [ 0.6,  0.4]]), 'TYPE': 'discrete'}}
get_model: It returns an instance of the appropriate model, e.g. BayesianModel in the case of the XMLBeliefNetwork format.
In [11]:
model = reader.get_model()
print(model.nodes())
print(model.edges())
['c', 'b', 'e', 'a', 'd']
[('c', 'e'), ('c', 'd'), ('b', 'd'), ('a', 'c'), ('a', 'b')]
pgmpy.readwrite.[fileformat]Writer is the base class for writing a model into the given file format. It takes a model as an argument, which can be an instance of BayesianModel or MarkovModel. Replace [fileformat] with the desired file format to which you want to write. This base class defines various methods for setting the contents of the new file from the given model. For example, for XMLBeliefNetwork the defined methods are as follows.
In [7]:
from pgmpy.models import BayesianModel
from pgmpy.factors import TabularCPD
import numpy as np

nodes = {'c': {'STATES': ['Present', 'Absent'],
               'DESCRIPTION': '(c) Brain Tumor',
               'YPOS': '11935',
               'XPOS': '15250',
               'TYPE': 'discrete'},
         'a': {'STATES': ['Present', 'Absent'],
               'DESCRIPTION': '(a) Metastatic Cancer',
               'YPOS': '10465',
               'XPOS': '13495',
               'TYPE': 'discrete'},
         'b': {'STATES': ['Present', 'Absent'],
               'DESCRIPTION': '(b) Serum Calcium Increase',
               'YPOS': '11965',
               'XPOS': '11290',
               'TYPE': 'discrete'},
         'e': {'STATES': ['Present', 'Absent'],
               'DESCRIPTION': '(e) Papilledema',
               'YPOS': '13240',
               'XPOS': '17305',
               'TYPE': 'discrete'},
         'd': {'STATES': ['Present', 'Absent'],
               'DESCRIPTION': '(d) Coma',
               'YPOS': '12985',
               'XPOS': '13960',
               'TYPE': 'discrete'}}

model = BayesianModel([('b', 'd'), ('a', 'b'), ('a', 'c'), ('c', 'd'), ('c', 'e')])

cpd_distribution = {'a': {'TYPE': 'discrete', 'DPIS': np.array([[0.2, 0.8]])},
                    'e': {'TYPE': 'discrete',
                          'DPIS': np.array([[0.8, 0.2],
                                            [0.6, 0.4]]),
                          'CONDSET': ['c'], 'CARDINALITY': [2]},
                    'b': {'TYPE': 'discrete',
                          'DPIS': np.array([[0.8, 0.2],
                                            [0.2, 0.8]]),
                          'CONDSET': ['a'], 'CARDINALITY': [2]},
                    'c': {'TYPE': 'discrete',
                          'DPIS': np.array([[0.2, 0.8],
                                            [0.05, 0.95]]),
                          'CONDSET': ['a'], 'CARDINALITY': [2]},
                    'd': {'TYPE': 'discrete',
                          'DPIS': np.array([[0.8, 0.2],
                                            [0.9, 0.1],
                                            [0.7, 0.3],
                                            [0.05, 0.95]]),
                          'CONDSET': ['b', 'c'], 'CARDINALITY': [2, 2]}}

tabular_cpds = []
for var, values in cpd_distribution.items():
    evidence = values['CONDSET'] if 'CONDSET' in values else []
    cpd = values['DPIS']
    evidence_card = values['CARDINALITY'] if 'CARDINALITY' in values else []
    states = nodes[var]['STATES']
    cpd = TabularCPD(var, len(states), cpd,
                     evidence=evidence,
                     evidence_card=evidence_card)
    tabular_cpds.append(cpd)
model.add_cpds(*tabular_cpds)

for var, properties in nodes.items():
    model.node[var] = properties
In [8]:
from pgmpy.readwrite.XMLBeliefNetwork import XBNWriter
writer = XBNWriter(model=model)
set_analysisnotebook: It sets the attributes for ANALYSISNOTEBOOK tag.
set_bnmodel_name: It sets the name of the BNMODEL.
set_static_properties: It sets the STATICPROPERTIES tag for the network.
set_variables: It sets the VARIABLES tag for the network.
set_edges: It sets edges/arcs in the network.
set_distributions: It sets distributions in the network.

### Sartaj Singh(SymPy)

#### GSoC: Update Week-10, 11 and 12

This is the 12th week, and the hard deadline is this Friday. GSoC is coming to an end, leaving behind a wonderful experience. Well, here's how my past few weeks went.

### Highlights:

Work on Formal Power Series:

• #9776 added the fps method to the Expr class. Instead of fps(sin(x)), users can now simply do sin(x).fps().
• #9782 implements some basic operations like addition, subtraction on FormalPowerSeries. The review is almost complete and should get merged soon.
• #9783 added the sphinx docs for the series.formal module.
• #9789 replaced all the solve calls in the series.formal with the new solveset function.
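
The work above can be illustrated with a minimal sketch (assuming the public fps entry point in SymPy, which the Expr method from #9776 delegates to):

```python
from sympy import fps, sin, symbols

x = symbols('x')

# Build the formal power series of sin(x) about x = 0.
s = fps(sin(x), x)

# truncate(n) returns the series terms below x**n plus an order term.
print(s.truncate(6))
```

Unlike a plain series expansion, the FormalPowerSeries object stores the general coefficient formula, which is what makes the arithmetic operations from #9782 possible.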

Work on computing limits of sequences:

This is the second part of my GSoC project aiming to implement the algorithm for computing limits of sequences as described in the poster Computing Limits Of Sequences by Manuel Kauers.

• #9803 implemented the difference_delta function. difference_delta(a(n), n) is defined as a(n + 1) - a(n). It is the discrete analogue of differentiation.
• #9836 aims at completing the implementation of the algorithm. It is still under review, and hopefully it will be merged soon.
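
A quick example of difference_delta (assuming the import path sympy.series.limitseq that #9803 introduced):

```python
from sympy import symbols
from sympy.series.limitseq import difference_delta

n = symbols('n', integer=True)

# difference_delta(a(n), n) computes a(n + 1) - a(n),
# the discrete analogue of the derivative.
dd = difference_delta(n**2, n)
print(dd.expand())
```

Just as the derivative of n**2 is 2*n, its difference delta is 2*n + 1, which is the kind of quantity the sequence-limit algorithm works with.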

The immediate task is to get #9782 and #9836 merged soon.

### Upcoming:

A thank you post ;)

## August 19, 2015

### Shridhar Mishra(ERAS Project)

#### Finals

The final model of the project is in place and the Europa planner is working the way it's supposed to.
The code in the repository is in working condition and includes the default NDDL plan, which moves the rover from the base to a rock and collects a sample.
Integration with the Husky rover is underway and the code is being wrapped up for the final submission.

Shridhar

### Michael Mueller(Astropy)

#### Week 12

Since GSoC is finally wrapping up, I pretty much spent this week reading over the code in the PR and writing documentation. I introduced a new section on table indexing in the docs after the "Table operations" section, which should give a good introduction to indexing functionality. It also links to an IPython notebook I wrote (http://nbviewer.ipython.org/github/mdmueller/astropy-notebooks/blob/master/table/indexing-profiling.ipynb) that displays some profiling results of indexing by comparing different scenarios, e.g. testing different engines and using regular columns vs. mixins. I also ran the asv benchmarking tool on features relevant to indexing, and fixed an issue in which Table sorting was slowed down when sorting on a primary index.

There's not much else to describe in terms of final changes, although I do worry about areas where index copying or relabeling come up unexpectedly and have a negative effect on performance. As an example, using the loc attribute on Table is very slow for an indexing engine like FastRBT (which is slow to copy), since the returned rows of the Table are retrieved via a slice that relabels indices. This is necessary if the user wants indices in the returned slice, but I doubt that's usually a real issue. I guess the two alternatives here are either to have loc return something else (like a non-indexed slice) or to simply advise in the documentation that using the index mode 'discard_on_copy' is appropriate in such a scenario.

## August 18, 2015

### Sahil Shekhawat(PyDy)

#### GSoC Final Week

GSoC is close to the end, with just one week remaining. The soft pencils-down date has passed and I, too, have finished all of the main implementation of the Body, Abstract Joint, PinJoint, SlidingJoint, CylindricalJoint, PlanarJoint, SphericalJoint and JointMethod classes. I have raised PRs for all of the classes.

### Andres Vargas Gonzalez(Kivy)

#### Nrecognizer kivy implementation

### Aman Singh(Scikit-image)

#### Normal errors while writing C++ code

Here I am compiling some of the most frequently repeated errors people make while coding in C++:

1. Placing the break statement in the wrong place. Remember that break should be placed as the last instruction of its branch, because the loop terminates right after it.
2. Conditions in the for statement: this is one of the most common errors. It comes in many forms, such as using the same variable to iterate in both the inner and outer loops of nested loops, or using a wrong increment or an irrelevant loop-breaking condition. Using aligned code and brackets can prevent many of these errors.
3. Segmentation fault in a while loop: many times we have to conditionally increment a variable in a while loop, but sometimes the condition we apply never increments the iterator, so the loop runs an infinite number of times. Updating the loop variable of a while loop is a must, and we must check that it happens in every case within the loop. There are also cases where we start a loop and forget to write the loop-variable update at all; this can be corrected by making it a habit to put the loop-variable update as the last line of the while loop.

## August 16, 2015

### Rafael Neto Henriques(Dipy)

#### [RNH post #13] Start wrapping up - Test singularities of kurtosis statistics

As we are reaching the end of the GSoC coding period, I am starting to wrap up the code that I developed this summer. When reviewing the code implementing the kurtosis standard statistics, I detected some problems in the performance of the analytical solution of the mean kurtosis function. In this post, I report how I overcame these issues! This post is extremely relevant for anyone interested in the full details of the implementation of the kurtosis standard measures.
## Problematic performance near function singularities

As I mentioned in previous posts, the function to compute the mean kurtosis was implemented according to an analytical solution proposed by Tabesh et al., 2011. The mathematical formulas of this analytical solution have some singularities, particularly for cases where the diffusion tensor has equal eigenvalues. To illustrate these singularities, I am plotting below the diffusion and kurtosis tensors of crossing-fiber simulations with different intersection angles. Simulations were performed with the modules implemented during the GSoC coding period.

Figure 1 - Diffusion tensors (upper panels) and kurtosis tensors (lower panels) for crossing fibers intersecting at different angles (the ground-truth fiber directions are shown in red).

The values of the eigenvalues of the diffusion tensor as a function of the intersection angle are shown below.

Figure 2 - Diffusion eigenvalues as a function of the crossing-fiber intersection angle. The first eigenvalue is plotted in red while the second and third are plotted in green and blue.

From the figure above, we can detect two problematic cases for the MK analytical solution:

1. When the intersection angle is zero (i.e. when the two fibers are aligned), the second diffusion eigenvalue is equal to the third eigenvalue.
2. When the intersection angle is 90 degrees, the first diffusion eigenvalue is equal to the second eigenvalue.

Based on the work done by Tabesh et al., 2011, these MK estimation singularities can be mathematically resolved by detecting the problematic cases and using specific formulas for each detected situation. In the previous version of my code, I detected the cases where two or three eigenvalues were equal by checking whether their differences were smaller than three orders of magnitude above the system's epsilon.
For example, to automatically check if the first eigenvalue equals the second eigenvalue, the following lines of code were used:

import numpy as np
er = np.finfo(L1.ravel()[0]).eps * 1e3
cond1 = (abs(L1 - L2) < er)

Although my testing modules showed that this procedure successfully resolved the singularities for the relevant eigenvalue differences, when plotting MK as a function of the intersection angle some unexpected underestimations were present in the regions near the singularities (see the figures below).

Figure 3 - MK values as a function of the crossing angle. The blue line shows the MK values estimated from the analytical solution while the red line shows the MK values estimated from a numerical method described in previous posts.

Figure 4 - MK values as a function of the crossing angle (range between 85 and 90 degrees). The blue line shows the MK values estimated from the analytical solution while the red line shows the MK values estimated from a numerical method described in previous posts. This figure was produced for a better visualization of the underestimations still present near the crossing angle of 90 degrees.

After some analysis, I noticed that MK underestimations were still present when eigenvalues were less than 2% different from each other. Given this, I was able to remove the underestimation by adjusting the eigenvalue-comparison criterion. As an example, to compare the first eigenvalue with the second, the following lines of code are now used:

er = 2.5e-2  # relative difference between two eigenvalues to be considered different
cond1 = (abs((L1 - L2) / L1) > er)

Below, I show the new MK estimates as a function of the crossing angle, where all underestimations seem to be corrected. Moreover, discontinuities at the limits between the problematic and non-problematic eigenvalue regimes are relatively small.
The most significant differences are now between the different MK estimation methods (for details on the difference between these methods revisit post #9).

Figure 5 - Corrected MK values as a function of the crossing angle. The blue line shows the MK values estimated from the analytical solution while the red line shows the MK values estimated from the numerical method described in previous posts.

### Rupak Kumar Das(SunPy)

#### End times

Hello everyone! The coding period is about to end, and I was supposed to post this update last week but forgot, so let me give a quick update. The Save support PR has been merged and closed. I have been working on the Slit, Line Profile and Intensity Scaling plugins, each of which is nearly complete with a few small bugs remaining. Those, along with the documentation, are what needs to be done in the next few days of the coding period. Although not required for the project, there are a couple of things that could not be completed, which I have decided to work on after the period.

See you next week (the final post)!

### Ambar Mehrotra(ERAS Project)

#### GSoC 2015: 7th Biweekly Report

Hello everyone, the last two weeks were quite hectic and I spent most of my time trying to figure out how to use the PANIC API for integrating alarms with the Aouda server. I also worked on the graph portion of the GUI and made some necessary changes.

Graphs: A user can now specify the graph update frequency as well as the maximum number of values that can be shown in the graph window at a given point in time. This was a necessary change, as different device servers can have different sampling rates for different attributes. These values are stored in a config file, which the user can later edit to adjust the graph update time as well as the number of values showing up in the graph window.

Panic API: I haven't been able to achieve much in this section because I am finding it really difficult to explore the PANIC API.
The documentation is lacking and simple examples have not been working out of the box. I have been doing as much research here as possible and have also asked some questions on their forum. I am waiting for them to respond so that I can move forward in this area.

Documentation: Since the coding period is about to end, I am mostly focusing on documenting the work as much as possible and cleaning up my code. I will mostly be doing this in the following week, and I will try to work on the PANIC integration if I get any responses from the PANIC community.

Happy Coding.

### Sumith(SymPy)

#### GSoC Progress - Week 10 and 11

Hello all. Here are the most recent developments in the Polynomial wrappers.

### Report

• The Polynomial wrapper was using piranha::hash_set, hence when Piranha was not available as a dependency, Polynomial wouldn't compile. The fix was to use std::unordered_set with -DWITH_PIRANHA=no, so that at least a slow version would be available.
• Another issue was Travis testing of Polynomial. Since we depend on Piranha, we had to set up Travis testing with Piranha included and the Polynomial tests run. This was done in the merged PR 585.
• Before we get Polynomial merged, we have to add mul_poly, improve printing, and test exhaustively. The mul_poly is ready here and will be merged once more tests are prepared. Previously, mul_poly never checked the variables corresponding to the hash_sets, which meant you could only multiply an n-variable polynomial by another n-variable polynomial over the same variable symbols. When the variables of the two hash_sets differ, a workaround is needed, and handling this directly would cause a slowdown. As suggested by Ondřej, mul_poly now calls two functions, _normalize_mul and _mul_hashset.
Here _normalize_mul ensures that the hash_sets satisfy the aforementioned criterion, and then _mul_hashset operates. For example, when mul_poly is called, _normalize_mul converts {1, 2, 3} of x, y, z and {4, 5, 6} of p, q, r to {1, 2, 3, 0, 0, 0} and {0, 0, 0, 4, 5, 6}, and _mul_hashset multiplies the two hash_sets. The speed of the benchmarks is determined by _mul_hashset.
• The printing needs improvement. As of now the polynomial 2*x + 2*y gets printed as 2*y**1*x**0 + 2*y**0*x**1.
• Not all that was planned could be completed this summer, mostly because of my hectic schedule after the vacations ended and the institute term began. I am planning to keep working after the program ends, when the workload eases. As the final deadline week of GSoC is coming up, I need to ensure that at least the PRs on hold get merged.

That's all I have. See ya!

## August 15, 2015

### Nikolay Mayorov(SciPy)

#### Linear Least Squares with Bounds

Hi! GSoC is coming to an end, so this will be the last technical post. I am a bit tired of all this, so it will be short. We decided that the last contribution will be a solver for linear least squares with bounds, i.e. we consider the problem $\frac{1}{2} \lVert A x - b \rVert^2 \rightarrow \min \limits_x \text{ s.t. } l \le x \le u$, where $A$ and $b$ are a given matrix and vector. This is a convex optimization problem, thus it is very well posed. It turns out, though, that approaches to solving it are not much simpler than for a nonlinear problem, but the convergence is generally better. My implementation https://github.com/scipy/scipy/pull/5110 contains 2 methods:

1. An adaptation of the Trust Region Reflective algorithm I used for the nonlinear solver. The difference is that a quadratic model is always accurate in linear least squares, hence we don't need to track or adjust the radius of a trust region, assuming it is big enough for full Gauss-Newton steps (+ some precautions were added).
Initially I tried to use another method called "reflective Newton", but it didn't converge well on some problems. I couldn't determine what the canonical implementation of this method is, but I got the feeling that it is not well designed for practical use. Note that MATLAB's version doesn't perform that well either (from my limited experience). On the other hand, I haven't found difficult problems for my TRF adaptation.

2. The classical Bounded-Variable Least Squares algorithm as described in the paper by Stark and Parker. This is an active-set method which optimally separates variables into free and active sets by an intelligent inclusion-exclusion procedure. The strong point of this algorithm is that it eventually determines the optimal solution unambiguously. But the number of iterations required can easily be on the order of the number of variables (and iterations are heavy), which restricts its usage for large problems. For small problems this method is very good. I also added a self-invented ad-hoc procedure for the method's initialization, which should (hopefully) decrease the number of iterations done by BVLS.

I think that's it for today. The next post will be the final one on GSoC.

### Isuru Fernando(SymPy)

#### GSoC week 10 and 11

The symengine-0.1.0 beta version was released this week, and these two weeks were spent making sure symengine works without a problem in Sage.

One issue was the linking of the Python libraries in Sage. In binary releases of Sage, the variable distutils.sysconfig.get_config_var('LIBDIR') is wrong: it is set to the build machine's location. On Windows it is set to empty. Earlier, to link the Python libraries into the Python wrappers, the Python library was found using the above variable, but in some cases like Sage and Windows this method fails. To fix this, CMake now also looks in sys.prefix/libs and sys.prefix/lib to find the Python libraries.

Another issue that came up was CMake generating bad link flags.
When installing in Sage, it is important to make sure that the libraries in Sage are linked and not the system-wide libraries. To do that, libraries were searched for in the Sage directories, ignoring the system-wide libraries. When the full paths of the libraries to link were given, we noticed a strange behaviour: /path/to/sage/local/lib/libgmp.so was changed to -lgmp, causing the linker to pick up the system-wide gmp library. After reading through the CMake documentation, I realized that this was due to find_library giving wrong locations for system libraries when there are multiple libraries of the same name for different architectures. For example, if the output of find_library(math NAMES m) was given to find the standard math library, it may find a libm.so that was intended for a different architecture. Therefore, when CMake realizes that the library being linked is a system library, the full path is converted to -lm to delegate to the linker the task of finding the correct library. This behaviour is useful in some scenarios, but in our case it was not the behaviour I needed. Fortunately there was a workaround for this mentioned in the documentation: using the IMPORTED target feature of CMake, I was able to get CMake to use the full path of the library. R7 and R8 benchmarks from the symbench benchmarks of Sage were added to benchmark SymEngine-C++ against GiNaC and also SymEngine-Python against SymPy and Sage.

### Shivam Vats(SymPy)

#### GSoC Week 12

Last week I told you why rs_series doesn't work with negative or fractional powers because of the constraints of a polynomial back-end, and why we need to modify polys. The situation isn't that hopeless actually. Let's talk about negative and fractional powers one by one.

### Negative Powers

The reason negative exponents work in ring_series is that I modified PolyElement to allow them. In hindsight, it wasn't the right decision and needs to be replaced with something that doesn't alter polys.
It is rather surprising that I came across a possible solution so late (now we know why good documentation is so important). I already knew that polys allows us to create a FractionField. A fraction field over a domain R consists of elements of the form a/b where a and b belong to R. In our case we are interested in the fraction field of a polynomial ring, i.e., fractions with polynomials as numerator and denominator. So a/b is not a * b**(-1) but is a / b, where a and b are polynomials. What was new to me was that, just like ring, polys also has a sparse field. In effect, it allows us to create sparse rational functions without altering anything. I modified some functions in ring_series to work with a rational function field here, and it works quite well indeed.

In [42]: from sympy.polys.fields import *
In [43]: F, a, b = field('a, b', QQ)
In [44]: p = rs_sin(a + b, a, 3)/a**5
In [45]: p*a
Out[45]: (-3*a**2*b - 3*a*b**2 + 6*a - b**3 + 6*b)/(6*a**4)

Note that all these are field operations and I haven't modified field.py in any way. Elegant! But then again, having a field increases the complexity, as we need to evaluate the numerator and denominator separately.

### Fractional Powers

Fractional powers are a much trickier case, as there is no simple solution like the one above. What we can do is optimise the option I had presented in my last post, i.e., have each fractional power as a generator. But doing that opens up a Pandora's box. Simple things such as sqrt(a)**2 == a do not hold true any more. The current rs_series treats sqrt(a) as a constant if we are expanding with respect to a:

In [22]: rs_series(sin(a**QQ(1,2)), a**QQ(1,2), 5)
Out[22]: -1/6*(sqrt(a))**3 + (sqrt(a))
In [23]: rs_series(sin(a**QQ(1,2)), a, 5)
Out[23]: sin(sqrt(a))

So, if we indeed decide to tread this path, we would need to replace a here with sqrt(a)**2. This really complicates the situation, as we need to figure out what to replace with what.
In any calculation the powers change multiple times, and each time we'll need to figure out how to rewrite the series.

### Next Week

It is now mostly a design decision whether we want ring_series to be confined within the polys module. The polys environment allows efficient manipulation of Laurent series (with FracField), but I doubt we can achieve the speed we want with Puiseux series without modifying polys. One possible solution is to separate the modified parts of polys, along with ring_series, from polys. We are using polys only because it has the data structure that we want. Separating them would allow us to simultaneously make use of its back-end and not introduce unnecessary complexity in our representation. Other than that, documentation is another priority now. I had planned to do it earlier too, but couldn't. This week's discovery has reminded me of its importance. Cheers! GSoC Week 12 was originally published by Shivam Vats at Me on August 15, 2015.

## August 14, 2015

### Siddhant Shrivastava(ERAS Project)

#### Telerobotics - The Penultimate Crescendo

Hi! As the hard-deadline date for the Google Summer of Code program draws to a close, I can feel the palpable tension that is shared by my mentors and fellow students at the Italian Mars Society and the Python Software Foundation.

# All-Hands Meeting

We at the Italian Mars Society had the third all-hands meeting last evening (13th August). The almost two-hour Skype conference call discussed a gamut of topics in depth. Some of these were:

## Software Testing guidelines

Ezio described the various ways of unit testing in different applications like rover movements, bodytracking, etc. In my case I had been checking for setup prerequisites and establishing the serializability of the ROS system before other modules could start up. That way it is ensured that all the required distributed systems are up and running before they are used.
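A startup prerequisite check of this kind can be sketched with the standard library alone — for instance, verifying that the ROS master's endpoint is reachable before the other modules start. This is a hedged illustration with a hypothetical helper name, not the actual Telerobotics code (which uses ROS's own APIs):

```python
import socket
from urllib.parse import urlparse

def master_reachable(uri="http://localhost:11311", timeout=2.0):
    """Return True if a TCP connection to the ROS master URI succeeds.

    Illustrative sketch only: the URI default is ROS's conventional
    master port, and the real module performs richer checks via ROS.
    """
    parsed = urlparse(uri)
    try:
        with socket.create_connection((parsed.hostname, parsed.port),
                                      timeout=timeout):
            return True
    except OSError:
        return False
```

Each dependent module would run such a check at startup and fail fast with a clear error, instead of hanging later when a distributed component turns out to be missing.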
Integration testing is crucial in the ERAS application, where things like Telerobotics, Bodytracking, and the EUROPA Planner all blend together seamlessly. I've integrated Telerobotics and Bodytracking, which can be observed in this commit.

## Telerobotics

Telerobotics in its current state is more precise than ever. This video demonstrates this fact - The YouTube link for the video is this. I improved upon the previous integration with Bodytracking and handled the possible exceptions that may occur. The results have been stunning. I used the updated version of Vito's bodytracker, which can detect closed hands. Since the sensor refresh rate has been reduced to 30 times per second, the Telerobotics module has much smoother movements. Here is a snapshot of the Bodytracking application running in a Windows Virtual Machine -

## EUROPA Planner and Navigation Integration

Shridhar has been working on the Planner, which outputs Cartesian coordinates in the format (x,y) to which the rover must navigate. I am using the AMCL navigation algorithm for known maps, in addition to the actionlib server of ROS, to facilitate this integration. The challenge here is to resolve between the Cartesian coordinates of EUROPA and those of ROS. This should hopefully be complete in the next couple of days.

## AMADEE15 mission

Yuval described that the recently concluded mission was a huge success, focused on the following frontiers:

• GPS integration with Blender
• Photogrammetry to reproduce Blender scenes for Virtual EVAs
• Unity3D and Oculus integration
• AoudaX realtime software
• Generic ERAS Data Logger
• Husky navigation

Franco briefly explained the Neuro-vestibular and Husky scientific experiments.

## Other things

Final efforts with Docker - After a lot of success, I have just one gripe with Docker.
Running the Gazebo simulator, rviz (the ROS visualizer), and the Telerobotics module requires THREE terminals. Working with ROS as a master inherently requires access to a lot of terminals for logging, echoing topic messages, starting programs, etc. The current ways to achieve multiple terminals and Qt applications in Docker are at best makeshift workarounds. To handle a graphics-heavy application like Telerobotics, we require a graphical environment. Docker is great for providing a common service framework but not so good with graphical applications like ROS. That's why I have been unable to get Docker working with the graphical aspects of ROS.

# Documentation

In the final leg of the program, it is vital to go all-guns-blazing with the documentation of the software work that the students do. This is to ensure future development, maintainability, and clarity of thought. I recently added instructions in the documentation directory - telerobotics/doc/ - to replicate my setup. This can be found in my current commit. I am ensuring that my mentors will be able to replicate my setup and give feedback very soon. The last week of GSoC is quite frenzied with the action to produce a consistent wrap-up of the project. The next post will officially be the last post of my GSoC 2015 experience. In reality, of course, I will keep working on the project and keep blogging :) Till then, ciao.

### Rafael Neto Henriques(Dipy)

#### [RNH post #12] Attempt to further improve the diffusion standard statistics

The denoising strategy that I used to improve the diffusion standard statistics (see my last post) required the estimation of the noise standard deviation (sigma). As a first approach, I used a simple sigma estimation procedure that was specifically developed for T1-weighted images. Thus, this might not be the most adequate approach for diffusion-weighted images. In particular, I noticed that the sigma estimates had a dependency on the b-values (smaller b-values were related to higher sigma).
Examples of the computed sigma for given b-values are shown below:

• b-value = 0 => sigma around 810
• b-value = 200 => sigma around 510
• b-value = 400 => sigma around 390
• b-value = 1000 => sigma around 268
• b-value = 2000 => sigma around 175

Comparing the original diffusion-weighted images with the denoised versions, I noticed that, for the smaller b-values, some image texture was present when computing the difference between the original and denoised version of the image. This suggests that the sigma values for smaller b-values are overestimated.

Figure 1. - Diffusion-weighted image with b-value set to 0. The left panel shows the image before being denoised, while the middle panel shows the denoised image. The difference between both images is shown in the right panel. Some image structure can be identified in the image difference, which suggests that important information is being removed in the denoising process.

Figure 2. - Diffusion-weighted image with b-value set to 2000. The left panel shows the image before being denoised, while the middle panel shows the denoised image. The difference between both images is shown in the right panel. Brain structure is not significantly identified in the image difference.

## Piesno

Given the issue mentioned above, I tried to replace the noise estimation procedure with a technique specifically developed for diffusion-weighted images - a technique called piesno. This technique can be imported and used from dipy with the following commands:

from dipy.denoise.noise_estimate import piesno
sigma, background_mask = piesno(data, N=4, return_mask=True)

The noise standard deviation given by piesno for all axial images was around 156. As expected, this value is smaller than the previous sigma estimates, suggesting that those were indeed overestimated. Although this value seems to be the most accurate estimate for the denoising procedure, I noticed that only a small number of the background voxels used to compute sigma were automatically detected by piesno.
Figure 3 - Background voxels detected by piesno. These voxels were the ones used to estimate the noise standard deviation.

Computing again the difference between the original and denoised versions of the data, I also noticed that the denoising procedure's performance was still dependent on the b-value. In particular, for b-value=0 the procedure seems to denoise only the middle of the image. Since sigma was kept constant, this dependency on the b-value seems to be caused by the denoising algorithm itself.

Figure 4. - Diffusion-weighted image with b-value set to 0. The left panel shows the image before being denoised, while the middle panel shows the denoised image. Noise estimation for the denoising procedure is now done using piesno. The difference between both images is shown in the right panel. Some image structure can be identified in the image difference, which suggests that important information is being removed in the denoising process.

Figure 5. - Diffusion-weighted image with b-value set to 2000. The left panel shows the image before being denoised, while the middle panel shows the denoised image. Noise estimation for the denoising procedure is now done using piesno. The difference between both images is shown in the right panel. Brain structure is not significantly identified in the image difference.

Below are the final versions of the kurtosis standard measures obtained after adjusting the sigma of the denoising procedure:

Figure 6 - Real brain parameter maps of the mean kurtosis (MK), axial kurtosis (AK), and radial kurtosis (RK) obtained from an HCP-like dataset using the DKI module. These are the maps specific to DKI. The dataset for these reconstructions was kindly supplied by Valabregue Romain, CENIR, ICM, Paris.

Noise artefacts are present when piesno is used; therefore, for the DKI reconstruction I decided to keep the previous denoising approach as the default.
### Udara Piumal De Silva(MyHDL)

#### Detailed steps for Hardware Verification of MySdramCntl.py

This post is a detailed guide to hardware-verifying MySdramCntl.py on the XuLA2 board.

#### 1.) Clone and setup the SDRAM_Controller repository

The repository is available at https://github.com/udara28/SDRAM_Controller The only requirements for using SDRAM_Controller are Python and MyHDL. An installation guide for MyHDL can be found at https://github.com/udara28/SDRAM_Controller

#### 2.) Testing [Optional]

The repository contains two tests, for the sdram model and the controller. These can be run using the following commands:

python test_sdram.py
python test_controller.py

If no error messages are shown and the output prints the written value properly, the tests have passed.

#### 3.) Converting to VHDL

Use the following command to generate the MySdramCntl.vhd and pck_myhdl_<myhdl_ver_no_>.vhd files:

python Conversion.py

This also generates MySdramCntl.v, but we don't need it this time.

#### 4.) Install XSTOOLs

Since we are using the XuLA2 board, we need to install the tools used to interact with the board. We need to build and install these tools from source because later we will modify the source a little. The XSTOOLs repository can be found at https://github.com/xesscorp/XSTOOLs Install the tools using the sudo make install command. You can then start gxstools and use it to interact with the XuLA2 board. (However, I faced several issues when trying to use gxstools from the repository. I have a forked repository of XSTOOLs where I fixed the issues I faced. It can be found at https://github.com/udara28/XSTOOLs Checking out the fix_thread_bug branch and running sudo make install worked for me.) After cloning the repository you will find the file XSTOOLs/xstools/xula2/ramintfc_jtag_lx25.bit When we use gxstools this file is written to the FPGA to interact with the built-in sdram.
We will modify this file so that it uses our SDRAM controller instead of the default controller. The source to generate this file is in the XuLA2 repository.

#### 5.) Clone and setup XuLA2

The XuLA2 repository can be found at https://github.com/xesscorp/XuLA2 However, to compile XuLA2 we also require the VHDL_Lib repository, which can be found at https://github.com/xesscorp/VHDL_Lib You need Xilinx ISE installed on your computer to compile and generate the bit files from these repositories. We are interested in the XuLA2/FPGA/ramintfc_jtag folder, which produces the bit file I mentioned above. Open the ISE project using Xilinx ISE by running the command ise ramintfc_jtag.xise

#### 6.) Add MySdramCntl.vhd and pck_myhdl_10.vhd to the project

Go to the files tab of the project and add the two new files MySdramCntl.vhd and pck_myhdl_10.vhd

#### 7.) Modify the ramintfc_jtag.vhd

The file ramintfc_jtag.vhd creates the bit file I mentioned above, so let's change this file to use MySdramCntl.vhd. You can find the modified file at https://gist.github.com/udara28/4a751e68508e37f28a55 Replace ramintfc_jtag.vhd with the modified ramintfc_jtag.vhd The modification replaces U4 : SdramCntl with U4 : MySdramCntl Synthesize and generate the bit file from ISE. You can verify that the changes were applied properly by looking at the RTL view. It should look as follows,

#### 8.) make

After the modifications we can run make within the ramintfc_jtag folder. However, there will be some errors. You can fix them simply by renaming the following files:

mv ramintfc_jtag_LX9.xst ramintfc_jtag_lx9.xst
mv ramintfc_jtag_LX25.xst ramintfc_jtag_lx25.xst

After those changes make will run smoothly and create the ramintfc_jtag_lx25.bit file, which is what we need.

#### 9.) Replace the ramintfc_jtag_lx25.bit

Replace the XSTOOLs/xstools/xula2/ramintfc_jtag_lx25.bit file with the bit file we just created and run sudo make install from the XSTOOLs directory.

#### 10.) Write and Read from the sdram

We can now start gxstools, write some data, and read it back to a file. For that I used the sample_input.hex file at https://gist.github.com/udara28/4a751e68508e37f28a55 If you have the IntelHex Python module installed, you can use it to look at the data within this file using a simple Python script:

from intelhex import IntelHex
sample_in = IntelHex('sample_input.hex')
size = len(sample_in)
for i in range(size):
    print sample_in[i]

This prints all 1454 data words available in the file. After writing, you can read the values back from the sdram to a file using gxstools. Write the read values to a file read_values.hex Use 1453 as the Upper Address and 0 as the Lower Address when doing the write and the read. Now that we have the read values in read_values.hex, we can check the values using a Python script similar to the one above. We can compare the data in the two hex files for any mismatches with a simple script as follows:

from intelhex import IntelHex
sample_in = IntelHex('sample_input.hex')
read_out = IntelHex('read_values.hex')
test_passed = 0
test_failed = 0
size = len(sample_in)
for i in range(size):
    if sample_in[i] == read_out[i]:
        test_passed = test_passed + 1
    else:
        test_failed = test_failed + 1
print "Tests Results = Passed: %d Failed: %d" % (test_passed, test_failed)

If there are any failed tests, then the controller has failed. My results were:

Tests Results = Passed: 1454 Failed: 0

This shows that the controller was able to successfully write and read back 1454 data words to and from memory without any errors.

#### End of hardware verification... :D

### Yue Liu(pwntools)

#### GSOC2015 Students coding Week 12 week sync 16

## Last week:

• Coding for Aarch64.
1. Learning the Aarch64 instruction set.
2. Finding gadgets for Aarch64, such as: ret; blr; br.
3. Aarch64 ABI.
4. Aarch64 ROP chain - TODO.

## Next week:

• Fixing potential bugs.
• Aarch64 support.
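At the byte level, a ROP chain is just 64-bit little-endian words laid out in memory. The chain-building step can be sketched in a few lines of Python; the addresses below are made up for illustration (real ones come from scanning the target binary), and this is not pwntools' actual API:

```python
import struct

# Hypothetical addresses, for illustration only.
GADGET_LDP_X0_X1 = 0x400123   # e.g. a gadget that reloads x0/x1 from the stack
TARGET_FUNC      = 0x400456   # function to pivot into afterwards

def pack_chain(words):
    """Pack a list of 64-bit values into a little-endian byte string."""
    return b"".join(struct.pack("<Q", w) for w in words)

chain = pack_chain([
    GADGET_LDP_X0_X1,  # address the hijacked control flow reaches first
    0x1111,            # value the gadget loads into x0
    0x2222,            # value the gadget loads into x1
    TARGET_FUNC,       # where execution continues next
])
assert len(chain) == 4 * 8
```

Note that on AArch64 the return address comes from x30 rather than being popped off the stack, so usable gadgets typically reload x29/x30 from the stack (e.g. ldp x29, x30, [sp], #16; ret) before returning — which is exactly why gadget hunting on this architecture needs its own treatment.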
Reference:

## August 13, 2015

### Vito Gentile(ERAS Project)

#### Enhancement of Kinect integration in V-ERAS: Sixth report

This is my sixth report on what I have done for my GSoC project. If you don't know what it is about and want to find more information, please refer to this page and this blog post. During the last two weeks, I have mainly worked on two topics: improving the gesture recognition module by fixing some bugs and adding a script to automatically use the latest version of the module, and trying to debug the user's step estimation algorithm by using ROS. As for the gesture recognition module, it is still able to detect whether the hands are closed or open, but I have not yet implemented the press gesture (which seems to be easily doable anyway). What I have done is make everything usable by a system manager, who should not have to focus on compiling and fixing dependencies, but just on executing the software. I have written a script which checks whether the DLL files are in the right folder and whether they are up to date. If not, the script copies these files to the right location, so that there are no errors even after recompiling the C++ module (which actually implements the gesture recognition, and is used via ctypes in Python). I have also tried to install ROS Indigo in my Ubuntu virtual machine, but at the end of the story I realized that trying to do it with Ubuntu 14.10 is not a good idea… Actually the ERAS project recommends using Ubuntu 14.04, but I have used 14.10 until now, so I hoped to continue without reinstalling everything in another virtual machine. Unfortunately, now I know that it is necessary, and this is what I will do.
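The "is the DLL in place and up to date?" check the script performs can be sketched with the standard library alone (the helper name and paths are hypothetical, not the actual ERAS script):

```python
import os
import shutil

def sync_dll(src, dst):
    """Copy src over dst when dst is missing or older than src.

    Sketch of the update check described above; in practice src would
    point at the compiled C++ module's output directory.
    """
    if not os.path.exists(dst) or os.path.getmtime(src) > os.path.getmtime(dst):
        shutil.copy2(src, dst)  # copy2 preserves timestamps
        return True             # file was (re)deployed
    return False                # already up to date
```

Because copy2 preserves the modification time, running the check twice in a row is a no-op the second time, which is what makes it safe to run on every startup.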
After installing ROS, I will be able to use the Turtle simulator to easily debug the step estimation procedure (which actually seems to work quite well, as you can see from the good quality of this simulation video, made by Siddhant): Now the end of GSoC is approaching, and what I hope to finalize is basically the implementation of press gesture recognition, a review of the body tracker documentation (which is quite acceptable actually) and the implementation of some unit tests. I will update you on my work! Ciao!

### Rafael Neto Henriques(Dipy)

#### [RNH post #11] Further improvements on the diffusion standard statistics

As I mentioned in my last post, I used the implemented modules to process data acquired with parameters similar to those of one of the largest worldwide projects, the Human Connectome Project. Considering that I was fitting the diffusion kurtosis model with practically no pre-processing steps, which are normally required in diffusion kurtosis imaging, the kurtosis reconstructions were looking very good (see Figure 2 of my last post). Despite this, some image artefacts were present, likely a consequence of Gibbs artefacts and MRI noise. In particular, some low-intensity voxels were present in regions where we expect MK and RK to be high. To correct these artefacts, I decided to add a pre-processing step that denoises the diffusion-weighted data (for the coding details, see directly my pull request). After fitting DKI on the denoised data, these are the amazing kurtosis maps that I obtained:

Figure 1 - Real brain parameter maps of the mean kurtosis (MK), axial kurtosis (AK), and radial kurtosis (RK) obtained from an HCP-like dataset using the DKI module. These are the maps specific to DKI. The dataset for these reconstructions was kindly supplied by Valabregue Romain, CENIR, ICM, Paris.

You can also see the standard diffusion measures obtained from my implemented DKI module compared to the previously implemented DTI module:

Figure 2.
Real brain parameter maps of the diffusion fractional anisotropy (FA), mean diffusivity (MD), axial diffusivity (AD), and radial diffusivity (RD) obtained from an HCP-like dataset using the DKI module (upper panels) and the DTI module (lower panels). Although DKI involves estimating a larger number of parameters, the quality of the standard diffusion measures of the HCP-like dataset from DKI seems to be comparable with the standard diffusion measures from DTI. This dataset was kindly supplied by Valabregue Romain, CENIR, ICM, Paris.

### Udara Piumal De Silva(MyHDL)

#### Moved to a Flat Directory

My work was moved from the Controller and Simulator folders to a single folder. After the move, small modifications were made to correct the path changes.

## August 12, 2015

### Chad Fulton(Statsmodels)

#### Dynamic factors and coincident indices

Factor models generally try to find a small number of unobserved "factors" that influence a substantial portion of the variation in a larger number of observed variables, and they are related to dimension-reduction techniques such as principal components analysis. Dynamic factor models explicitly model the transition dynamics of the unobserved factors, and so are often applied to time-series data. Macroeconomic coincident indices are designed to capture the common component of the "business cycle"; such a component is assumed to simultaneously affect many macroeconomic variables. Although the estimation and use of coincident indices (for example the [Index of Coincident Economic Indicators](http://www.newyorkfed.org/research/regional_economy/coincident_summary.html)) pre-dates dynamic factor models, in several influential papers Stock and Watson (1989, 1991) used a dynamic factor model to provide a theoretical foundation for them. Below, we follow the treatment found in Kim and Nelson (1999), of the Stock and Watson (1991) model, to formulate a dynamic factor model, estimate its parameters via maximum likelihood, and create a coincident index.
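The structure of such a model is easy to see in a toy simulation: a single AR(1) factor drives several observed series, each through its own loading. This is a sketch of the general setup only, not the statsmodels implementation, and the parameter values are arbitrary:

```python
import random

random.seed(0)

phi = 0.8                       # factor AR(1) coefficient
loadings = [1.0, 0.5, -0.7]     # lambda_i for three observed indicators
T = 200

# Factor transition: f_t = phi * f_{t-1} + e_t
factor = [0.0]
for _ in range(T - 1):
    factor.append(phi * factor[-1] + random.gauss(0.0, 1.0))

# Measurement: y_it = lambda_i * f_t + u_it
series = [[lam * f + random.gauss(0.0, 0.3) for f in factor]
          for lam in loadings]
```

A coincident index is then essentially an estimate of `factor` recovered from `series` alone, which is what the maximum-likelihood state-space treatment below provides.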
### Vivek Jain(pgmpy)

#### UAI Reader And Writer

After the mid-term, I worked on the UAI reader and writer module. It has now been successfully merged into the main repository.

### UAI Format Brief Description

It uses the simple text file format specified below to describe problem instances. Link to the format: UAI

A file in the UAI format consists of the following two parts, in that order:

1. Preamble
2. Function

Preamble: It starts with text denoting the type of the network. This is followed by a line containing the number of variables. The next line specifies each variable's domain size, one at a time, separated by whitespace. The fourth line contains only one integer, denoting the number of functions in the problem (conditional probability tables for Bayesian networks, general factors for Markov networks). Then, one function per line, the scope of each function is given as follows: the first integer in each line specifies the size of the function's scope, followed by the actual indexes of the variables in the scope. The order of this list is not restricted, except when specifying a conditional probability table (CPT) in a Bayesian network, where the child variable has to come last. Also note that variables are indexed starting with 0.

Example of Preamble:

MARKOV
3
2 2 3
2
2 0 1
3 0 1 2

In the above example the model is MARKOV, the number of variables is 3, and the domain sizes of the variables are 2, 2, and 3 respectively. For reading the preamble we used the pyparsing module. To get the number of variables and their domain sizes, we declared the methods get_variables and get_domain, which return the list of variables and a dictionary with the variable names as keys and their domain sizes as values.
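As a minimal sketch of what reading such a preamble involves (illustrative only — pgmpy's actual reader is built on pyparsing), simple whitespace-splitting is enough for the token stream:

```python
def parse_preamble(text):
    """Parse a UAI preamble: network type, variables, domains, scopes."""
    tokens = text.split()
    net_type = tokens[0]                         # e.g. "MARKOV" or "BAYES"
    n_vars = int(tokens[1])
    sizes = [int(t) for t in tokens[2:2 + n_vars]]
    pos = 2 + n_vars
    n_funcs = int(tokens[pos]); pos += 1
    scopes = []
    for _ in range(n_funcs):                     # one scope per function
        k = int(tokens[pos]); pos += 1           # size of this scope
        scopes.append([int(t) for t in tokens[pos:pos + k]])
        pos += k
    variables = ["var_%d" % i for i in range(n_vars)]
    return net_type, variables, dict(zip(variables, sizes)), scopes

print(parse_preamble("MARKOV 3 2 2 3 2 2 0 1 3 0 1 2"))
# ('MARKOV', ['var_0', 'var_1', 'var_2'],
#  {'var_0': 2, 'var_1': 2, 'var_2': 3}, [[0, 1], [0, 1, 2]])
```

The dictionary returned here plays the role of get_domain's output, and the variable list that of get_variables.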
For example, for the above preamble the method get_variables will return [var_0, var_1, var_2] and the method get_domain will return {var_0: 2, var_1: 2, var_2: 3}

Function: In this section each function is specified by giving its full table (i.e., specifying the function value for each tuple). The order of the functions is identical to the one in which they were introduced in the preamble. For each function table, first the number of entries is given (this should be equal to the product of the domain sizes of the variables in the scope). Then, one by one, separated by whitespace, the values for each assignment to the variables in the function's scope are enumerated. Tuples are implicitly assumed in ascending order, with the last variable in the scope as the 'least significant'.

Example of Function:

2
 0.436 0.564

4
 0.128 0.872
 0.920 0.080

6
 0.210 0.333 0.457
 0.811 0.000 0.189

## August 11, 2015

### Michael Mueller(Astropy)

#### Week 11

This week I implemented a nice bit of functionality that Tom suggested, inspired by a similar feature in Pandas: retrieving index information via the Table attributes loc and iloc. The idea is to provide a mechanism for row retrieval in between a high-level query() method and dealing with Index objects directly.
Here's an example:

In [2]: t = simple_table(10)

In [3]: print t
  a   b   c
--- ---- ---
  1  1.0   c
  2  2.0   d
  3  3.0   e
  4  4.0   f
  5  5.0   g
  6  6.0   h
  7  7.0   i
  8  8.0   j
  9  9.0   k
 10 10.0   l

In [4]: t.add_index('a')

In [5]: t.add_index('b')

In [6]: t.loc[4:9] # 'a' is the implicit primary key
Out[6]:
<Table length=6>
  a      b     c
int32 float32 str1
----- ------- ----
    4     4.0    f
    5     5.0    g
    6     6.0    h
    7     7.0    i
    8     8.0    j
    9     9.0    k

In [7]: t.loc['b', 1.5:7.0]
Out[7]:
<Table length=6>
  a      b     c
int32 float32 str1
----- ------- ----
    2     2.0    d
    3     3.0    e
    4     4.0    f
    5     5.0    g
    6     6.0    h
    7     7.0    i

In [8]: t.iloc[2:4]
Out[8]:
<Table length=2>
  a      b     c
int32 float32 str1
----- ------- ----
    3     3.0    e
    4     4.0    f

The loc attribute is used for retrieval by column value, while iloc is used for retrieval by position in the sorted order of an index. This involves the designation of a primary key, which for now is just the first index added to the table. Also, indices can now be retrieved by column name(s):

In [9]: t.indices['b']
Out[9]:
  b  rows
---- ----
 1.0    0
 2.0    1
 3.0    2
 4.0    3
 5.0    4
 6.0    5
 7.0    6
 8.0    7
 9.0    8
10.0    9

Aside from this, I've been adding miscellaneous changes to the PR, such as getting np.lexsort to work with Time objects, reworking the SortedArray class to use a Table object instead of a list of ndarrays (for working with mixins), putting index_mode in Table, etc. Tom noted some performance issues when working with indices, which I've been working on as well.

### Chienli Ma(Theano)

#### Self Criticism

In the last week I wasted a lot of time by not truly understanding others' code, and I want to make a self-criticism. At the beginning of the OpFromGraph.c_code() task, Fred pointed me to a few commits which he thought might help and which mainly implemented the make_thunk() method. "This is weird, why did he provide code for a different method?", I thought. With only a glance at the code, and with some confusion, I turned to CLinker and gof.Op. I thought this might be a simple mission, which it wasn't.
The biggest problem was my misunderstanding of CLinker – I thought it was something like PerformLinker and VM_Linker, called by orig_func() and linking the whole graph. But the truth is that CLinker does not serve the fgraph, but the node. In gof.Op.make_thunk(), if op_use_c_code is true, it will call make_c_thunk() and use CLinker to generate C code and link the storage, and then return a ret (what's a ret?). So there are two equivalent ways to implement c_code() and make an Op faster. One is the way I took – implementing the series of C-code methods of Op, so that the Op can return C code according to the fgraph. In this way I need to generate the C code first, and then break that code apart while ensuring it can still be compiled after being reassembled by CLinker. This requires a thorough understanding of CLinker, of which I only got the basic idea. Therefore I got stuck. The other way is to override the make_thunk() (or make_c_thunk()) method, which is Fred's way. This is much easier, because we do not need to separate the code: it is generated as a whole and independently. We don't link the cthunk with storage until it's generated (really?), which saves a lot of trouble and makes full use of CLinker's ability. Fred had already given me workable code; I only needed to improve it a bit. But my ignorance led me down another path. Therefore I decided to post this blog to remind myself that every time before I take a step, I need to fully understand what I'm going to do and how I'm going to accomplish it, as well as the suggestions from others. Otherwise, the more effort I make, the more resources I waste. Also, ask questions when confused. Shame on me this time.

### Shivam Vats(SymPy)

#### GSoC Week 11

Sorry for the delayed post! Last week was extremely busy. It's time to wrap up my work. The good news is that rs_series (I called it series_fast earlier) works well for Taylor series. The speedups are impressive and it can handle all sorts of cases (so far!). Now, I need to make it work for Laurent and Puiseux series.
Given that the ring_series functions work well with negative and fractional powers, ideally that shouldn't be difficult. However, my current strategy is to add variables as generators to the currently used ring. The back-end for creating rings is in polys, which doesn't allow negative or fractional powers in the generators (that is the mathematical definition of polynomials). For example:

In [276]: sring(a**QQ(2,3))
Out[276]: (Polynomial ring in a**(1/3) over ZZ with lex order, (a**(1/3))**2)
In [277]: _[0].gens
Out[277]: ((a**(1/3)),)

Contrast this with:

In [285]: sring(a**2)
Out[285]: (Polynomial ring in a over ZZ with lex order, a**2)

Generators with negative or fractional powers are treated as symbolic atoms and not as some variable raised to some power. So these fractional powers will never simplify with other generators with the same base. The easy way to fix this is to modify sring, but that would mean changing the core polys. I am still looking for a better way out. The polynomial wrappers PR had been lying dead for quite some time. It currently uses piranha's hash_set, but it needs to work on unordered_set when piranha is not available. I am adding that here. It is mostly done, except for the encode and decode functions. Once the wrappers are in, I can start porting the ring_series functions.

### Next Week

• Make rs_series work for Puiseux series.
• Complete the polynomial wrappers.
• Port the low-level ring_series functions.

Cheers! GSoC Week 11 was originally published by Shivam Vats at Me on August 11, 2015.

### AMiT Kumar(Sympy)

#### GSoC : This week in SymPy #10 & #11

Hi there! It's been 11 weeks into GSoC and we have reached the last week before the soft deadline. Here is the progress so far.

### Progress of Week 10 & 11

For the last couple of weeks, I worked mainly on the documentation of the solveset module. It's very important to let others know what we are doing and why we are doing it, so PR #9500 is an effort to accomplish that.
Here are some of the important questions I have tried to answer in PR #9500:

:check: What was the need for a new solvers module?
:check: Why do we use sets as an output type?
:check: What is this domain argument about?
:check: What will you do with the old solve?
:check: What are the general design principles behind the development of solveset?
:check: What are the general methods employed by solveset to solve an equation?
:check: How do we manipulate and return infinite solutions?
:check: How does solveset ensure that it is not returning any wrong solutions?

There is still some polishing required, as suggested by @hargup.

#### Linsolve Docs

I completed the documentation PR for linsolve. See PR #9587.

#### Differential Calculus Methods

I have also started working on the differential calculus methods as mentioned in my proposal here. See the diff-cal branch.

### from future import plan Week #12:

This week I plan to finish up all the pending work, wrap up the project, and get PR #9500 merged.

### git log

PR #9500 : Documenting solveset

PR #9587 : Add Linsolve Docs

That's all for now, looking forward for week #12. :grinning:

## August 10, 2015

### Mridul Seth(NetworkX)

#### GSOC 2015 PYTHON SOFTWARE FOUNDATION NETWORKX BIWEEKLY REPORT 5

This blog post covers week 9, 10, 11.

Summer almost over, great and fun work :)

iter_refactor has been merged into the master branch and hopefully everything works fine :)

So now all the base class’s methods return an iterator instead of a list/dict. Now I have started working on algorithms: algorithms and functions that return a list/dict should return an iterator instead. We have started working on the shortest_paths algorithms https://github.com/networkx/networkx/pull/1715.
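The refactor pattern (a method that used to build and return a full list instead returns an iterator, and callers materialize explicitly only when needed) can be sketched with a toy graph class; this is illustrative only, not actual NetworkX code:

```python
class TinyGraph:
    """Toy adjacency-list graph illustrating the list -> iterator refactor."""

    def __init__(self):
        self.adj = {}

    def add_edge(self, u, v):
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)

    # Old style: build the whole list in memory up front.
    def neighbors_list(self, n):
        return list(self.adj[n])

    # New style: lazily yield neighbors; callers wrap in list() if needed.
    def neighbors(self, n):
        return iter(self.adj[n])

g = TinyGraph()
g.add_edge(1, 2)
g.add_edge(2, 3)
print(sorted(g.neighbors(2)))  # -> [1, 3]
```

On large graphs the iterator version avoids materializing a throwaway list every time a caller only wants to loop once.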

Related issues https://github.com/networkx/networkx/issues/1632

## August 09, 2015

### Raghav R V(Scikit-learn)

#### GSoC 2015 The New Cross Validation Interface and use

This blog will be about the new interface of the cross-validation module in scikit-learn, which will become available very soon as the work on PR #4294 is nearing completion and hopefully it will get merged soon.

The two main features of the new model_selection module are -
• Grouping together all the classes and functions related to model-selection or evaluation and
• Data-independent CV classes which, instead of taking data / data-dependent parameters at initialisation time, expose a new split(X, y, labels) method that generates the splits based on the chosen strategy.
The second feature is of considerable importance to a lot of people as it enhances the usability of CV objects. An important benefit is that nested cross-validation can now be performed easily.

To read more about nested cross-validation, refer to my previous blog post.

This paper, by Cawley et al published in JMLR in the year 2010, also elaborates on the importance of nested cross-validation for model selection.

Let us now work with the diabetes dataset and use the new API to build and evaluate different settings of the SVR using nested cross-validation.

(In case you are wondering what SVRs are, this article explains them nicely. Do check it out!)

The diabetes dataset consists of 442 samples consisting of 10 features per sample as the data and the diabetes score as the target. It is a simple regression problem.

Let us hypothetically assume that this data was compiled from 10 patients with multiple samples from each patient at different times. With this assumption let us label the data samples based on the patient id (arbitrarily chosen) ranging from 1 - 10.

    >>> diabetes_dataset = load_diabetes()
    >>> X, y = diabetes_dataset['data'], diabetes_dataset['target']
    >>> labels = np.array([1]*50 + [2]*45 + [3]*60 + [4]*10 + [5]*25 + [6]*155
    ...                   + [7]*20 + [8]*10 + [9]*20 + [10]*47)

Now this hypothetical assumption has a hypothetical consequence. The sample distribution tends to group/cluster around each patient, so there is a possibility that any model trained on such a dataset might overfit on those groups (patients) and predict the target well only for those patients whose data was used for training the model.

(To clarify why this is different compared to the regular overfitting problem, any model built on such a dataset could perform well on unseen data from the old patients (whose data was used in training), but perform poorly on unseen data from new patients (whose data was not used for training the model). Hence testing such a model even on the unseen data from old patients could potentially give us a biased estimate of the model's performance.)

So it becomes necessary to evaluate the model by holding out one patient's data and observing how the model trained with the rest of the patient's data generalizes to this held-out patient. (This gives us an unbiased estimate of the model)

This can be easily done by using sklearn's LeaveOneLabelOut cross-validator.

NOTE
1. Nested cross-validation can be illustrated without this slightly convoluted hypothetical example. However, to appreciate the real benefit of having data independence in the CV iterator, I felt it would be better to show how we can flexibly choose different CV strategies for the inner and outer loops even when one of them uses labels to generate the splits.
2. Another similar example would be data collected from multiple similar instruments. Here the samples would be labelled with the instrument id.
To perform model selection we must explore the hyperparameter space of our model and choose the one that has the best unbiased score (which means that our model generalizes well outside the training data)

Now let us do a grid search with a range of values for the three important hyperparameters epsilon, C and gamma

(Note that gamma is the rbf kernel's parameter and not a hyperparameter per se)

    >>> epsilon_range = [0.1, 1, 10, 100, 1000]
    >>> C_range = [0.1, 1, 10, 100]
    >>> gamma_range = np.logspace(-2, 2, 5)
    >>> parameter_grid = {'C': C_range, 'gamma': gamma_range, 'epsilon': epsilon_range}

Let us import the LeaveOneLabelOut, KFold, GridSearchCV and cross_val_score from the new model_selection module...

    >>> from sklearn.model_selection import (
    ...     GridSearchCV, LeaveOneLabelOut, KFold, cross_val_score)

Let us use LeaveOneLabelOut for the inner CV and construct a GridSearch object with our parameter_grid...

    >>> inner_cv = LeaveOneLabelOut()
    >>> grid_search = GridSearchCV(SVR(kernel='rbf'),
    ...                            param_grid=parameter_grid,
    ...                            cv=inner_cv)

And use KFold with n_folds=5 for the outer CV...

    >>> outer_cv = KFold(n_folds=5)

We now use the cross_val_score to estimate the best params for each fold of the outer split and analyse the best models for variance in their scores or parameters. This gives us a picture of how much we can trust the best model(s)

    >>> cross_val_score(
    ...     grid_search, X=X, y=y,
    ...     fit_params={'labels': labels},
    ...     cv=outer_cv)
    array([ 0.40955022,  0.55578469,  0.4796581 ,  0.43532192,  0.55993554])

Good, so the scores seem to be similar-ish, with a variance of < +/- 0.1.

Let us do a little more analysis to know what the best parameters are at each fold... This allows us to check if there is any variance in the model parameters between the different folds...

    >>> for i, (tuning_set, validation_set) in enumerate(outer_cv.split(X, y)):
    ...     X_tuning_set, y_tuning_set, labels_tuning_set = (
    ...         X[tuning_set], y[tuning_set], labels[tuning_set])
    ...
    ...     grid_search.fit(X_tuning_set, y_tuning_set, labels_tuning_set)
    ...
    ...     print("The best params for fold %d are %s,"
    ...           " the best inner CV score is %s,"
    ...           " the final validation score for the best model "
    ...           "of this fold is %s\n"
    ...           % (i+1, grid_search.best_params_, grid_search.best_score_,
    ...              grid_search.score(X[validation_set],
    ...                                y[validation_set])))
    The best params for fold 1 are {'epsilon': 10, 'C': 100, 'gamma': 1.0}, the best inner CV score is 0.446247290221, the final validation score for the best model of this fold is 0.409550217773
    The best params for fold 2 are {'epsilon': 10, 'C': 100, 'gamma': 10.0}, the best inner CV score is 0.454263206641, the final validation score for the best model of this fold is 0.555784686655
    The best params for fold 3 are {'epsilon': 0.1, 'C': 100, 'gamma': 10.0}, the best inner CV score is 0.456009154539, the final validation score for the best model of this fold is 0.479658102225
    The best params for fold 4 are {'epsilon': 10, 'C': 100, 'gamma': 10.0}, the best inner CV score is 0.465195173573, the final validation score for the best model of this fold is 0.43532192105
    The best params for fold 5 are {'epsilon': 0.1, 'C': 100, 'gamma': 10.0}, the best inner CV score is 0.44172440761, the final validation score for the best model of this fold is 0.559935541672

So the 5 best models are similar and hence we choose values epsilon as 10, gamma as 10 and C as 100 for our final model.

The main thing to note here is how the new API makes it easy to pass inner_cv to GridSearchCV making it really simple to perform nested CV using two different types of CV strategies in just 2 lines of code.


    >>> grid_search = GridSearchCV(SVR(kernel='rbf'),
    ...                            param_grid=parameter_grid,
    ...                            cv=LeaveOneLabelOut())
    >>> cross_val_score(
    ...     grid_search, X=X, y=y,
    ...     fit_params={'labels': labels},
    ...     cv=KFold(n_folds=5))
    array([ 0.40955022,  0.55578469,  0.4796581 ,  0.43532192,  0.55993554])

EDIT (30th October 2015): The model_selection module has been merged along with the documentation!

### Zubin Mithra(pwntools)

I just added in support for PPC SROP. You can see the pull request here. The doctest is skipped, as the current version of qemu-user segfaults when the test is run. If you try debugging the integration test at line 162 as follows, you can see that the sigreturn system call works just fine; however, after returning from do_syscall, qemu fails.

The files used for testing can be found here. The sigreturn call works as expected when run on a qemu-system-powerpc build.

### Aman Jhunjhunwala(Astropy)

#### GSOC ’15 Post 5 : Final Lap

In about 2 weeks, we finally end what has been a fun and exhausting Summer of Code 2015! It has been a great journey and now it's time to pull it across the finish line…

We started our “Preview Phase” 15 days back, and we have received a great response from the community! There has been productive feedback across the board and we have done our level best to integrate most of it. Summarizing some of the feedback and the solutions for the week:

1. Design-related issues: I was expecting a few negative reviews for the CSS because I knew that using 2 light colors (white and subdued yellow) was always a risk. Though some of the reviews were positive, a majority felt that the CSS needed changes. The complaints were:
a) The contrast was low: depending on the resolution of the screen, the yellow wavered from a brownish tone (on high-def screens) to light lemon yellow (on old CRT or low-def screens)
b) The font was too thin and difficult to read
c) The text on the homepage appeared on a single wide line, so one had to move his head all the way across to read something.
d) The actual posts were being dwarfed by their surroundings. There were too many elements at once on a single post display page.
e) The design was too “airy”, with lots of unused space.
f) The logo should be bolder.
h) One review for and one review against the planets right up front.
Solutions applied

If you browse the homepage again, you will see that a lot of the main page components have been redesigned.
• The color used now is a dark yellowish brown. I guess after hours of color-hopping we have landed on our ideal color. This closed all contrast-related issues (a).
• I changed the font to a darker and richer one. Instead of using one font for the entire website, we are now using 5 different fonts. Hopefully the fonts complement each other and the design! (and are easy to read). This closed (b).
• The new homepage displays posts in small rectangular areas, making them easier to read and allowing for a more compact and information-rich homepage. Now 8 posts from Forum, 8 from Teach and Learn, and 4 from Packages appear on our homepage. This closed (c) and (g).
• The single post view was completely redesigned to make it less crowded and also give a better look to smaller posts. Comments, ratings and sharing buttons have now been clubbed into 3 sections under a common container. The sidebar was removed as well. Head over here to view the changes. This closed (d).
• The logo was redesigned a bit, adding more weight and a uniform color, closing (f).
• Padding and margins were adjusted throughout the web application, to make the look less “airy”, closing (e).
• The planets were replaced by a carousel that hosts the “Pic of the Day” grabbed from astronomy.org. I have tried to make the backend simple yet powerful! Anyone can upload images of any resolution and it will automatically stretch them to fit the carousel, with the text on top self-adjusting!
2. Speed / load time: The web app was around 4 MB in size and took 5-6 seconds to load completely when we launched the preview phase. This wasn't taken well by the users. Gabriel suggested that any page loading in over 2 seconds would be unpleasant. Since then, I have compressed most resources, adjusted the JS and CSS calls, implemented cache mechanisms, redesigned layouts, and brought the loading time down from 6 seconds to 2.2 seconds and the size from 4 MB to 890 kB.
I spent a good 2 days trying to implement the “Memcache” caching mechanism (one of the fastest caches available today), only to find out that PythonAnywhere has currently rolled back Memcache support.
However, scope still remains to decrease loading times further and I will be doing so in the coming week.
3. Separate login for using Disqus: Users initially had to log in to our own website and then log in again to Disqus to post a comment. This was annoying. Disqus does have a mechanism called “Single Sign-On” to allow users to log in automatically using the site's DB – but that required elevated Disqus privileges. I received the privileges from Disqus support yesterday, and the process to implement it looks pretty daunting and complex. I will be trying to implement it this week.
4. Adding a tag cloud feature: I have been thinking about a speed-effective solution to this problem for weeks now and I have found one. I plan to store all the tags of a section in a separate JSON file along with the number of times each tag appears. The file is updated periodically (once a day or so) and the counters are updated. Every time we wish to create a tag cloud, it will read directly from the JSON files and display the tag cloud. I have written the code for this but am facing some display issues (CSS), so I haven't uploaded it yet!
5. Check for code compliance using some testing application over the web.
6. Other changes :
• Added an option to connect multiple social accounts to one user (e.g. if the user has an FB account and a Gmail account with different email addresses, he can link them, and the next time he uses any OAuth provider, he gets back to the same account)
• Changed the algorithm for the unique token in “Anonymous Voting”. There was a flaw in the previous implementation. Resolved.
• Lots of minor stuff here and there!
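The tag-count scheme from point 4 can be sketched roughly as follows; the helper and file names here are hypothetical, and the real site presumably pulls tags from its Django models rather than a flat list of dicts:

```python
import json
from collections import Counter

def rebuild_tag_cloud(posts, path):
    """Count tag occurrences across posts and persist them as JSON.

    `posts` is an iterable of dicts, each with a 'tags' list. Run this
    periodically (e.g. once a day from a cron job) so that page views
    only ever read the cached JSON file instead of scanning all posts.
    """
    counts = Counter(tag for post in posts for tag in post["tags"])
    with open(path, "w") as f:
        json.dump(counts, f)
    return counts

posts = [{"tags": ["astropy", "gsoc"]}, {"tags": ["astropy"]}]
rebuild_tag_cloud(posts, "/tmp/tag_cloud.json")
```

The tag cloud view then only has to `json.load` the file and scale font sizes by the stored counts.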
Besides these feedback-related issues, I have added 2 new sections:
1. A live group chat section under the heading “Chat” on the navbar. After trying out a dozen chat hosts (Gitter, HipChat, CromaChat, etc.), I settled on the current provider. It's a DEV section and will be removed if it is not useful. People can come there and ask for info, help, or anything!
2. Shifted the TrinketIO block from the front page to a new page, accessible from the footer (/try/python).
For the last lap, we will close down the final issues and begin final deployment onto our production server at astropython.org. Hopefully things will tie themselves up at the end and we will end the Summer of Code on a high! Zooming in… Meet you on the other side of the finish line!

### Udara Piumal De Silva(MyHDL)

#### Hardware verification Completed..!!

After several modifications I was able to integrate my controller to xstools so that gxstools uses my controller to read and write to sdram. Below is the RTL view after integrating my controller

Initially the read value did not match the written value. After some fixes, the compare results showed that the read values exactly match the written values. This verifies that the controller actually behaves as expected.

## August 08, 2015

### Christof Angermueller(Theano)

#### GSoC: Week eight and nine

Theano OpFromGraph ops allow you to define new operations that can be called with different inputs at different places in the compute graph. I extended my implementation to compactly visualize OpFromGraph ops: by default, an OpFromGraph op is represented as a single node. Clicking on it will reveal its internal graph structure. Have a look at this example!

OpFromGraph ops may be composed of further OpFromGraph nodes, which will be visualized as nested graphs as you can see in this example.

In the last stage of GSoC 2015, I will improve how nodes are arranged in the visualization, shorten node labels, and show more detailed information about nodes such as their definition in the source code!

The post GSoC: Week eight and nine appeared first on Christof Angermueller.

### Lucas van Dijk(VisPy)

#### GSoC 2015: First sightings of a graph!

Hi all, and welcome to another progress report on my Google Summer of Code project!

First of all, my pull request for the ArrowVisual is now merged! This finalizes my work on arrow heads for VisPy for now. Having completed this, I could finally move on to the next part of the project, the actual GraphVisual.

And I've made quite some progress already on the GraphVisual: it is already possible to visualise a graph with VisPy! There are already a few simple automatic layout algorithms implemented, including random positions, all nodes on a circle, and a force-directed layout based on the Fruchterman-Reingold algorithm. However, the latter still needs some performance improvements.

The nice thing about the current layout API is that any layout algorithm has the possibility to yield intermediate results, allowing us to animate the calculation of the new layout of the graph.
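The yield-per-iteration idea can be sketched as a plain Python generator; this toy (hypothetical names, not the actual VisPy layout API) nudges node positions randomly instead of computing Fruchterman-Reingold forces, but shows how yielding intermediate results enables animation:

```python
import random

def jitter_layout(positions, steps=3, step_size=0.1, seed=0):
    """Toy iterative layout: nudge each node and yield every intermediate state.

    A real force-directed layout would compute attractive/repulsive forces
    here; the yield-per-iteration structure is what lets a caller redraw
    the graph after each step and animate the layout converging.
    """
    rng = random.Random(seed)
    pos = dict(positions)
    for _ in range(steps):
        for node in pos:
            x, y = pos[node]
            pos[node] = (x + rng.uniform(-step_size, step_size),
                         y + rng.uniform(-step_size, step_size))
        yield dict(pos)  # snapshot for the caller to draw

frames = list(jitter_layout({1: (0.0, 0.0), 2: (1.0, 0.0)}))
print(len(frames))  # -> 3
```

A rendering loop would simply iterate over the generator and update node markers on each frame.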

The current status can be seen in the pictures below:

The pull request for the GraphVisual can be found here: https://github.com/vispy/vispy/pull/1043

### Rafael Neto Henriques(Dipy)

#### [RNH post #10] Progress Report (7th of August)

We are almost getting to the end of the GSoC coding period =(.

The good news is that progress is still going at full speed!!! I finalized the work on the standard kurtosis statistics estimation, and great progress was made on the white matter fiber direction estimates from diffusion kurtosis imaging (DKI). Details can be found below!

## Implementation of DKI statistics is now complete!!!

As I planned in my previous post, in the last couple of weeks I created a sample usage script for the DKI statistics estimation module using data acquired with parameters similar to those of the Human Connectome Project (HCP). Figures for both the diffusion and kurtosis standard statistics are looking very good (see below), which is great news. These results show that my implemented module can be used in the analysis of one of the largest worldwide projects aiming to map the human brain's connections.
 Figure 1. Real brain parameter maps of the diffusion fractional anisotropy (FA), mean diffusivity (MD), axial diffusivity (AD), and radial diffusivity (RD) obtained from a HCP-like dataset using the DKI modules (upper panels) and the DTI module (lower panels). Although DKI involves the estimation of a larger number of parameters, the quality of the standard diffusion measures of the HCP-like dataset from DKI seems comparable with the standard diffusion measures from DTI. This dataset was kindly supplied by Valabregue Romain, CENIR, ICM, Paris.

 Figure 2 - Real brain parameter maps of the mean kurtosis (MK), axial kurtosis (AK), and radial kurtosis (RK) obtained from a HCP-like dataset using the DKI module. These are the maps specific to DKI. The dataset for these reconstructions was kindly supplied by Valabregue Romain, CENIR, ICM, Paris.

I also dramatically improved the speed of the kurtosis statistics estimation modules! In my previous post, I mentioned that I had optimized the code so that all three standard kurtosis statistics are processed within 30 min. Now all three standard kurtosis statistics can be computed within 1 min. Reprocessing the kurtosis measures shown in Figure 1 of my post #6 now takes:

• Mean kurtosis - 32 sec (before 14 mins)
• Radial kurtosis - 12 sec (before 7 mins)
• Axial kurtosis - 42 sec (before 1 min)

## Advances on the DKI based fiber direction estimates

Based on the DKI-ODF described in my previous post, a procedure to extract fiber direction estimates was implemented. This was done using the quasi-Newton algorithms available in Scipy's optimization module. As an example of the fiber direction estimates using the implemented procedure, we show below the estimates obtained from real brain voxels of the corpus callosum:

 Figure 3 - Sagittal view of the direction estimates of horizontal corpus callosum fibers obtained from the DKI-ODF.

## Last steps for the Google summer of code 2015

• The work on the DKI based fiber direction estimates will be finalized by making the fiber estimates compatible with the tractography methods already implemented in Dipy. In this way, I will be able to produce the first DKI based tractography on HCP-like data.
• With the procedures to estimate the standard kurtosis statistics and the DKI based fiber estimates in place, I will finish the work proposed in my proposal by implementing some novel DKI measures that can be related to concrete biophysical parameters.

### Udara Piumal De Silva(MyHDL)

#### Hardware verification started ...

I have started hardware verification of the design. This is done by writing some test data to the SDRAM and reading it back.

First I had to install gxstools, a GUI tool written in Python that allows interacting with the XuLA2 board. Initially I faced some issues where it crashed with an exception. As a workaround I disabled threads in testing and SDRAM read/write so I could use gxstools without it crashing in the middle.

Then I tried writing XuLA_jtag.hex to the SDRAM. But then I got the following console message:
intelhex.NotEnoughDataError: Bad access at 0x800: not enough data to read 3143694 contiguous bytes

I created a different hex file using the data given in the file intelhex/test.py and got a success message for writing. To verify that the data had been written and could be read back from the SDRAM, I used upper address 31 and lower address 4, did a write from test.hex, and read the values back into a file output.hex.

I compared these two hex files using a simple Python script:

from intelhex import IntelHex

ih_in = IntelHex('/home/udara/test.hex')
ih_out = IntelHex('/home/udara/output.hex')

# Compare the bytes at addresses 4 to 31 (inclusive).
for i in range(4, 32):
    print 'input => ', ih_in[i], 'output => ', ih_out[i]

The output showed that all the data from 4 to 31 is equal in the two files.

Now that I can use the existing source to read and write hex files to the SDRAM, I will try to use my design in the test, which will let me know whether my controller can perform reads and writes without errors.

## August 07, 2015

### Mark Wronkiewicz(MNE-Python)

#### Fine calibration: one of many SSS improvements

With SSS itself implemented, I’m now trying to process a backlog of improvements that were made since the algorithm’s initial publication in the mid-2000s. There are four or five of these modifications, some of which will boost noise rejection by an order of magnitude or more. The first set of three improvements goes under the umbrella term “fine calibration.”

The SSS algorithm depends heavily on the location and geometry of the MEG sensors. Therefore, it’s not surprising that any small error in the believed location or behavior of these sensors will introduce a significant error in the filter’s output. Fine calibration consists of three modifications to correct for these sensor inconsistencies. For all of these improvements, we record empty room data and construct a “fine calibration” file. The first fix updates the orientation of each sensor coil. Because the sensor coils pick up the magnetic flux through their coil loops, a more accurate estimate of the true orientation will yield a more accurate representation in the multipolar moment space. The second fix concerns the gradiometers only. Again, there are small imperfections in the MEG coils, and gradiometers measure small signal differences between pairs of loops. If one gradiometer loop has any physical differences from its twin, a substantial error will be introduced into the recorded signal. Therefore, we simulate small point-like magnetometers at the center of each gradiometer to account for this gradiometer “imbalance.” The third and final fix is concerned with imperfections in the magnetometers. Again, we’re dealing with physical devices, so we measure whether any of these sensors have readings that are slightly too high or low in amplitude and correct for this with a calibration coefficient from that same fine calibration file. This final improvement has a relatively small effect compared to the first two.
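The third fix amounts to a per-channel gain correction. A minimal sketch of the idea (illustrative only, not the actual MNE-Python or proprietary implementation; the array shapes and function name are assumptions):

```python
import numpy as np

def apply_mag_calibration(data, cal_coeffs):
    """Divide each magnetometer channel by its calibration coefficient.

    data: (n_channels, n_times) array of magnetometer readings.
    cal_coeffs: (n_channels,) array from the fine-calibration file; a
    value of 1.02 means the sensor reads 2% too high, so dividing it
    out recovers the true amplitude.
    """
    data = np.asarray(data, dtype=float)
    coeffs = np.asarray(cal_coeffs, dtype=float)
    return data / coeffs[:, np.newaxis]

raw = np.array([[1.02, 2.04],
                [0.98, 1.96]])
corrected = apply_mag_calibration(raw, [1.02, 0.98])  # each row rescaled
```

The orientation and gradiometer-imbalance fixes are more involved, since they change the forward model rather than just rescaling channels.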

I’ve finished the code for the fine calibration implementation, but the filtered results aren’t a close enough match with the proprietary code just yet. On the bright side, the undiscovered bug is not causing the filter to completely fail. Once I find the issue, I’ll be on a sprint to implement temporal SSS before the end of summer!

### Manuel Paz Arribas(Astropy)

#### Progress report

My work in the last 2 weeks has mainly focused on the Gammapy tools to select observations from an observation list, introduced in the mid-term summary report, and on the script to produce cube background models presented in the last progress report. The progress on these 2 topics is presented in more detail in the following sections.

In addition, I also contributed to some of the clean-up tasks in order to prepare for the release of the Gammapy 0.3 stable version in the coming weeks.

## Observation selection

I restructured the code that I produced a few weeks ago to make it clearer and defined 3 main observation selection criteria:
• Sky regions: these methods select observations on a certain region of the sky, defined as either a square (sky_box), or a circle (sky_circle).
• Time intervals: this method selects observations in a specific time range, defined by its minimum and maximum values.
• Generic parameter intervals: this method selects observations in a (min, max) range on a user-specified variable present in the input observation list. The only requirement for the variable is that it should be castable into an Astropy Quantity object: the variable should represent either a dimensionless quantity (like the observation ID), or a simple quantity with units (like the altitude angle or the livetime of the observations).
More details are given in the documentation I wrote for the select_observations function.
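The generic parameter-interval selection can be sketched as a simple filter; this is an illustrative sketch with hypothetical row and column names, ignoring units for brevity (the real function wraps the column in an Astropy Quantity so that dimensional variables like altitude angle or livetime work the same way as dimensionless ones like the observation ID):

```python
def select_obs_range(obs_rows, variable, value_min, value_max):
    """Keep observations whose `variable` lies in [value_min, value_max).

    `obs_rows` is a list of dicts standing in for rows of an
    observation table; the same min/max logic applies to any column.
    """
    return [row for row in obs_rows
            if value_min <= row[variable] < value_max]

# Toy observation list with an altitude-angle column in degrees.
obs = [
    {'OBS_ID': 1, 'ALT_deg': 20.0},
    {'OBS_ID': 2, 'ALT_deg': 45.0},
    {'OBS_ID': 3, 'ALT_deg': 70.0},
]
selected = select_obs_range(obs, 'ALT_deg', 30.0, 80.0)
print([row['OBS_ID'] for row in selected])  # -> [2, 3]
```

Chaining several such calls (sky region, then time range, then parameter ranges) gives the recursive selection behaviour the command-line tool exposes.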

In addition, I produced a working command-line tool to act on an input observation list file and output a selected observation file. This tool can perform the same kinds of selections mentioned above in a recursive way. More details are given in the documentation I wrote for the gammapy-find-obs tool, which uses the find-obs script.

In order to test the gammapy-find-obs tool, a dummy observation list file produced with the make_test_observation_table tool presented in the first report has been placed in the gammapy-extra repository here.

## Cube background model production

I produced a working version of the script that produces cubes similar to the one presented in the animation in the last post, with many improvements.

The new version of the script uses a finer binning, both in altitude/azimuth angles for the observation grouping, and in (X, Y, energy) coordinates for the cube itself. The binning in (X, Y, energy) also depends on the amount of statistics (coarser coordinate binning for observation groups with less statistics).

In addition, a smoothing to avoid Poisson noise due to low statistics is applied to the background cube model. The smoothing also depends on the available statistics: cubes with more statistics are smoothed less than cubes with less statistics. The smoothing applied is quite simple. It is performed by convolving the 2D image of each energy bin of the cube independently with a certain kernel, and scaling the smoothed image to preserve the original integral of the image. The kernel chosen is the default kernel used by the ROOT TH2::Smooth function: it is named k5a and acts on 2 layers of surrounding pixels (i.e. the 2 nearest neighbors). Interpreting images as 2D arrays, the kernel is represented by the following matrix:
0 0 1 0 0
0 2 2 2 0
1 2 5 2 1
0 2 2 2 0
0 0 1 0 0
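Assuming the script smooths each energy slice independently as described, the step might look roughly like this (an illustrative scipy sketch with the k5a kernel above and integral-preserving rescaling, not the actual Gammapy code):

```python
import numpy as np
from scipy.ndimage import convolve

# k5a kernel from ROOT's TH2::Smooth (acts on 2 layers of neighbors).
K5A = np.array([[0, 0, 1, 0, 0],
                [0, 2, 2, 2, 0],
                [1, 2, 5, 2, 1],
                [0, 2, 2, 2, 0],
                [0, 0, 1, 0, 0]], dtype=float)

def smooth_energy_bin(image):
    """Convolve one 2D (X, Y) slice with k5a, preserving its integral."""
    smoothed = convolve(image, K5A / K5A.sum(), mode='nearest')
    total = smoothed.sum()
    if total > 0:
        # Rescale so the smoothed image keeps the original integral.
        smoothed *= image.sum() / total
    return smoothed

rng = np.random.default_rng(0)
cube_slice = rng.poisson(5, size=(10, 10)).astype(float)
out = smooth_energy_bin(cube_slice)
print(np.isclose(out.sum(), cube_slice.sum()))  # -> True
```

Looping this over the energy axis, with a statistics-dependent choice of kernel, gives the per-cube smoothing described above.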

I applied the script to the H.E.S.S. data to produce background models for a full set of altitude/azimuth angle bins with satisfactory results. Unfortunately, since the data is not public, I am not allowed to post any figures or animations showing them.

The current version of the script is functional but still a bit chaotic, because it consists of only a few long functions. Therefore, the next step, to be accomplished in the remaining time of the Google Summer of Code, is to integrate the script into Gammapy, refactor the code, and move most of its functionality into methods and classes within Gammapy.

### Wei Xue(Scikit-learn)

#### Progress Report 3

My mentor gave me some useful advice after I finished all the code for BayesianGaussianMixture and DirichletProcessGaussianMixture. So in these two weeks, I fixed the style problems and wrote all the necessary test cases for BayesianGaussianMixture. I also did a visualization of Gaussian mixtures with variational inference for the four types of precision using matplotlib.animation, link

Next, I will explore some optional tasks, namely incremental learning and other covariance estimators, besides the test cases for DirichletProcessGaussianMixture.

### Siddhant Shrivastava(ERAS Project)

#### Fine-tuning Telerobotics

Hi! As discussed in the previous week, I have been able to get the integration of Telerobotics and Bodytracking up and running. Huge Victory :) Let me say the same thing in a much bolder typeface -

## Integration Successful!

The following screenshot demonstrates what I'm talking about -

## Screen recording - YouTube Video

I used the same tool for screen capturing this integration that I used for real-time streaming from a 3-D camera. The output is as follows -

If my blogging platform is unable to embed the video on the page, you could use this link to watch the first version of Telerobotics and Bodytracking integration. The Visual Tracker designed by Vito looks like this -

## Current Status

It is evident from the video that the setup is functional but not efficient. Moreover, it is buggy. The velocity values are way off the mark of what ROS can take, which results in jerks in Husky's motion. Also, there is a disparity between the refresh rates of ROS and Tango-Controls, which shows up as the device being intermittently unavailable.

I strongly hope I'll be able to solve these issues by the next post. Of all the aha moments that I have been privy to, watching the integration work was probably the biggest one of them all. It looks futuristic to me. With the Internet of Everything, a lot of things are going to use teleoperation. I am so glad that we at the Italian Mars Society are gauging future trends and experimenting with them in the present. I am honored to be facilitating that experiment.

My next post is surely going to be a much more exciting run-down on how Telerobotics progresses :)

Stay Tuned. Ciao!

### Jakob de Maeyer(ScrapingHub)

Things are going smoothly for my Summer of Code. As the add-on system's orientation shifted during the summer from a focus on user-friendliness towards being precise and fitting into the existing framework, I have scrapped the stretch goal of a command-line helper tool. Instead, I will try to integrate spider callbacks into the add-on system, i.e. spiders will be able to implement the add-on interface as well, and be called back to update settings or to check the final configuration.

### The base Addon class

As I said, things are just humming along right now, which means I don’t really have too much to blog about here. So I will use this blog post to introduce another feature of my PR: A base Addon class that developers can (but don’t have to) use to ease some common tasks of add-ons, such as inserting a single component into the settings or exporting some configuration. Again, I’m hoping that I can reuse some of this for Scrapy’s docs.

The Addon base class provides three convenience methods:

• Basic settings can be exported via export_basics().
• A single component (e.g. an item pipeline or a downloader middleware) can be inserted into Scrapy’s settings via export_component().
• The add-on configuration can be exposed into Scrapy’s settings via export_config().

By default, the base add-on class will expose the add-on configuration into Scrapy’s settings namespace, in caps and with the add-on name prepended. It is easy to write your own functionality while still using the convenience methods by overriding update_settings().

Each of the three methods can be configured via some class attributes:

### Exporting basic settings via export_basics()

The class attribute basic_settings is a dictionary of settings that will be exported with addon priority.

### Inserting a single component via export_component()

• The component to be exported is read from the component class attribute. It can be either a path to a component or a Python object.
• The type of the component is read from the component_type class attribute. It should match the name of the setting associated with that component type, e.g. ITEM_PIPELINES or DOWNLOADER_MIDDLEWARES.
• The order of the component is read from component_order. This setting only applies to ordered components, e.g. item pipelines or middlewares.
• The key of the component is read from component_key. This only applies to unordered components such as download handlers.

### Exposing the add-on configuration into Scrapy’s settings

• The prefix to be used for the global settings is read from settings_prefix. If that attribute is None, the add-on name will be used.
• The default configuration will be read from default_config.
• Specific setting name mappings for single configuration entries can be set in the config_mapping dictionary.
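
Putting the three methods together, the behaviour described above might look roughly like the sketch below. This is hypothetical illustration only: it uses plain dicts in place of Scrapy's Settings objects, and the names follow this post rather than any final merged API.

```python
# Hypothetical sketch of the base Addon behaviour described above.
# Plain dicts stand in for Scrapy's Settings; not the real Scrapy API.
class Addon(object):
    name = 'myaddon'
    basic_settings = {}        # exported as-is by export_basics()
    component = None           # path or object to insert
    component_type = None      # e.g. 'ITEM_PIPELINES'
    component_order = 0        # for ordered component settings
    default_config = {}
    settings_prefix = None     # falls back to the add-on name
    config_mapping = {}        # per-entry setting-name overrides

    def export_basics(self, settings):
        settings.update(self.basic_settings)

    def export_component(self, settings):
        if self.component is not None:
            settings.setdefault(self.component_type, {})
            settings[self.component_type][self.component] = self.component_order

    def export_config(self, config, settings):
        prefix = self.settings_prefix or self.name
        conf = dict(self.default_config, **config)
        for key, value in conf.items():
            # Default naming: add-on name prepended, in caps
            setting = self.config_mapping.get(key, (prefix + '_' + key).upper())
            settings[setting] = value

    def update_settings(self, config, settings):
        # Default behaviour; subclasses override this for custom logic
        self.export_basics(settings)
        self.export_component(settings)
        self.export_config(config, settings)
```

A subclass would then only set the class attributes (name, component, default_config, ...) and inherit the exporting behaviour.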

### Richard Plangger(PyPy)

#### GSoC: Vec, the little brother of NumPy array

Back from Bilbao, I have been busy testing and hunting bugs. Resolving all the issues took me quite some time, because before that I had not been running all the tests regularly.

Since then I have been working on vectorizing "user code". The primary goal of this project is to speed up trace loops that iterate over NumPy arrays, but as I said in the last post, it might (or might not) make sense to optimize traces found in the user program as well.

### Vec, the little brother of NumPy array

Using the following snippet one can build a class for vectors in Python and let the optimization speed up the computation.

import array

class Vec(object):
    # ...
    def __init__(self, content, type='d'):
        self.size = len(content)
        self.type = type
        self.array = array.array(type, content)

    def add(self, other, out=None):
        # ...
        # Ensure that other is the right type and size,
        # out must be allocated if it is None
        # ...
        i = 0
        #
        # execute pypy with --jit vectorize_user=1 to
        # enable the optimization
        #
        while i < self.size:
            out.array[i] = self.array[i] + other.array[i]
            i += 1

    # ...

After tracing the loop in the add function a slightly better vector loop is generated. Let's run the program:

# jit warmup
a, b, c = vec(...), vec(...), vec(...)
# start time.time()
for i in range(500):
    c = a * b
    a = c - b
# stop time.time()

The following results were obtained (after the JIT had warmed up):
PyPy (vecopt):    ~0.005
PyPy (no vecopt): ~0.008
CPython:          ~0.040

PyPy shows higher variance in the execution times; the garbage collector might be the reason for that. The program was run 5 times and the mean value is shown above.

Honestly, I'm unsure whether there is a real benefit yet. But since PyPy stores arrays of integers/floats (that are fully homogeneous) without the overhead of embedding each element in a PyObject, SIMD operations could be used for normal Python lists as well.

The problem with this optimization is that the program must run for a very long time and spend a significant fraction of time in the trace loop that has been optimized. The short evaluation above shows that there might be potential. I will further investigate, because this is a great way to find bugs in the implementation as well.

Traces emitted by the user program are much more complex than those in the NumPy library. Working on this over the last week, I found many edge cases and was even reminded that I had left some TODOs in the source code.

### Abraham de Jesus Escalante Avalos(SciPy)

#### Progress Report

Hello all,

The GSoC is almost over. It's been a great experience so far and if you've been following my blog you know that I have decided to continue my involvement with the community, so this is only getting started.

With that in mind, and with some support from my mentor (Ralf Gommers), some tasks have taken a backseat while others have gone beyond the original intended scope. Most notably, the NaN policy, which started out as a side note to a simple issue, has become the single largest effort in the project, not just in lines of code but also in community involvement (you can follow the discussion here or the PR here).

NaN policy is now in the bike-shedding phase (reaching consensus on keyword and option names), but it is only the start of a long-term effort that is likely to span months (maybe years, depending on pandas and NumPy).

The NIST test cases for one way analysis of variance (ANOVA) are also coming along nicely and once they are done I will continue with the NIST test cases for linear regression.

Right now there are no major roadblocks but it is worth mentioning that Ralf and I have agreed to move the pencils down date to Aug 18th. This is due to the fact that I have to move to Canada soon to begin my master's degree, and this way I can travel to Toronto on Aug 20th to look for a place to live and also spend some quality time on vacation with my girlfriend Hélène, who has been a great support for me during this transition in my life. I feel like she has earned it just as much as I have.

Classes begin on Sept 16th. Once I feel like I'm settled into the new rhythm, I will get back to work picking up on loose ends or side tasks (like extending NaN policy's coverage) so the project will not suffer. I would also seek Ralf's guidance to start integrating myself into the pandas, numpy and possibly scikit-learn communities because I plan to steer my career towards data science, machine learning and that sort of stuff.

I will need to figure out where my motivation takes me, but this is a challenge that makes me feel excited about the future. GSoC may be almost done, but for me this is only just beginning and I could not be happier. As always, thank you for taking the time to read about my life.

Until next time,
Abraham.

## August 06, 2015

### Stefan Richthofer(Jython)

#### JyNI (final) status update

Because I was accepted as a speaker for the JVM Language Summit (about the JyNI project, see http://openjdk.java.net/projects/mlvm/jvmlangsummit/agenda.html), I had to finish my GSoC project a bit early, i.e. already today. I took the last two weeks off to work full time on the project, and garbage collection is finally working as proposed. Unfortunately I could not investigate the optional goal of ctypes support, but I will follow up on this as soon as possible.

## Modifications to Jython

To make this work, some modifications to Jython were required.

### New GC-flags

1) FORCE_DELAYED_FINALIZATION
If activated, all Jython-style finalizers are not directly processed, but cached for a moment and processed right after the ordinary finalization process.

2) FORCE_DELAYED_WEAKREF_CALLBACKS
Rather similar to FORCE_DELAYED_FINALIZATION, but delays callbacks of Jython-style weak references.

JyNI always activates FORCE_DELAYED_WEAKREF_CALLBACKS and if there are native objects that can potentially cause a PyObject resurrection (JyNI-GC needs this sometimes), JyNI also activates FORCE_DELAYED_FINALIZATION.

FORCE_DELAYED_WEAKREF_CALLBACKS makes it possible to restore weak references pointing to a resurrected object. This is done in a thread-safe manner: if someone calls a weakref's get-method while the weakref is in a pending state, the call blocks until the reference has been restored or finally cleared.

FORCE_DELAYED_FINALIZATION allows JyNI to prevent Jython-style finalizers from running, in case their objects were resurrected subsequently to an object-resurrection by JyNI.

This way the object-resurrection can be performed without any notable impact on Jython-level. (Raw Java-weak references would still break and also Java-finalizers would run too early, which is why PyObjects must not implement raw Java-finalizers.)

### Empty PyTuple

When working with Jython I took the opportunity to unify Py.EmptyTuple and PyTuple.EMPTY_TUPLE, which were two singleton constants for the same purpose. JyNI also has a native counterpart constant for empty tuples, but until now it was not clear to which of the named Jython constants it should be bound.

JyNI's dependence on these features implies that JyNI requires Jython >2.7.0 from now on. I aim to sync JyNI 2.7-alpha3-release with Jython 2.7.1, so that JyNI's >2.7.0-requirement is fulfillable.

## Garbage collection

The more advanced GC behavior is tested in test_JyNI_gc.py.
test_gc_list_modify_update demonstrates the case where the native reference graph is modified by an object that properly reports the modification to JyNI.
To test the edge case of gc with a silently modified native reference graph, I added the listSetIndex method to DemoExtension. This method modifies the native reference graph without reporting it to JyNI. test_gc_list_modify_silent verifies that JyNI properly detects this issue and performs resurrection as needed.
Further, it tests that Jython-style weak references pointing to the resurrected object stay valid.

## JyNI-support for weak references

Support for weak references is implemented but not yet fully stable; tests and more debugging work are needed. For now the code is included in JyNI, but not active. I will follow up on this, like on ctypes, as soon as possible.

### Abhijeet Kislay(pgmpy)

#### Last Progress Report

I have successfully implemented the triplet clustering method from the second paper. I am able to follow Abinash’s comments this time, and he also appears happy with my coding so far. The results are all cool :) They are 99% accurate if the convergence is fast. I am checking accuracy by comparing it against the MPLP […]

### Andres Vargas Gonzalez(Kivy)

#### Corner detection on Strokes and Strokes annotations on Matplotlib

Following my proposal for the Google Summer of Code 2015, an algorithm to detect corners on strokes was implemented; it is going to be polished and tested with a template-matching classifier. The implementation follows Wolin's approach in his paper "ShortStraw: A Simple and Effective Corner Finder for Polylines", which can be downloaded from: http://faculty.cse.tamu.edu/hammond/publications/pdf/2008WolinSBIM.pdf

Briefly, the algorithm first resamples the points of a stroke so that consecutive points are equally spaced. Then, for a window of +/-3 points around each point, the distance between the first and last point of the window forms a "straw", and the mean of the straw lengths is computed. The straw is shorter in the cases where the window of points contains a corner.
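
The straw computation can be sketched in a few lines of Python. This is a simplified, hypothetical version of the approach just described (mean rather than the paper's median-based threshold, and no local-minimum post-processing), not the actual Kivy implementation:

```python
import math

def resample(points, spacing):
    """Resample a stroke so consecutive points are ~equally spaced."""
    resampled = [points[0]]
    dist_accum = 0.0
    prev = points[0]
    i = 1
    while i < len(points):
        (x0, y0), (x1, y1) = prev, points[i]
        d = math.hypot(x1 - x0, y1 - y0)
        if dist_accum + d >= spacing:
            # Interpolate a new point exactly `spacing` along the stroke
            t = (spacing - dist_accum) / d
            q = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            resampled.append(q)
            prev = q
            dist_accum = 0.0
        else:
            dist_accum += d
            prev = points[i]
            i += 1
    return resampled

def straw_corners(points, window=3, threshold=0.95):
    """Flag points whose straw is well below the mean straw length."""
    straws = {}
    for i in range(window, len(points) - window):
        (x0, y0), (x1, y1) = points[i - window], points[i + window]
        straws[i] = math.hypot(x1 - x0, y1 - y0)
    mean = sum(straws.values()) / len(straws)
    return [i for i, s in straws.items() if s < threshold * mean]
```

On an L-shaped stroke, the indices returned cluster around the bend; ShortStraw then keeps only the local straw minimum in each cluster.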

From the figure it can be seen that most corners are being detected; however, a post-processing step still needs to be performed. In addition, a stroke canvas behavior was added to a figure canvas widget, so annotations can be drawn on top of the graphs, as can be seen in the figure below. The yellow background and the blue dots are just debugging information.

### Julio Ernesto Villalon Reina(Dipy)

Hi all,

These last two weeks have been very busy, mainly with finalizing the testing of the code. There have been unforeseen roadblocks during testing, especially regarding the handling of singularities. It hasn’t been easy, since these have to comply with the desired segmentation task, especially when taking into account the various input images that one can use. Below is an example of the test for a grey-scale image such as a T1 image of the brain:

def test_greyscale_iter():

    com = ConstantObservationModel()
    icm = IteratedConditionalModes()

    mu, sigma = com.initialize_param_uniform(image, nclasses)
    sigmasq = sigma ** 2
    neglogl = com.negloglikelihood(image, mu, sigmasq, nclasses)
    initial_segmentation = icm.initialize_maximum_likelihood(neglogl)
    npt.assert_equal(initial_segmentation.max(), nclasses - 1)
    npt.assert_equal(initial_segmentation.min(), 0)

    mu, sigma, sigmasq = com.seg_stats(image, initial_segmentation, nclasses)
    npt.assert_equal(mu.all() >= 0, True)
    npt.assert_equal(sigmasq.all() >= 0, True)

    final_segmentation = np.empty_like(image)
    seg_init = initial_segmentation.copy()

    for i in range(max_iter):

        print('iteration: ', i)

        PLN = com.prob_neighborhood(image, initial_segmentation, beta,
                                    nclasses)
        npt.assert_equal(PLN.all() >= 0.0, True)
        PLY = com.prob_image(image, nclasses, mu, sigmasq, PLN)
        npt.assert_equal(PLY.all() >= 0.0, True)

        mu_upd, sigmasq_upd = com.update_param(image, PLY, mu, nclasses)
        npt.assert_equal(mu_upd.all() >= 0.0, True)
        npt.assert_equal(sigmasq_upd.all() >= 0.0, True)
        negll = com.negloglikelihood(image, mu_upd, sigmasq_upd, nclasses)
        npt.assert_equal(negll.all() >= 0.0, True)
        plt.figure()
        plt.imshow(negll[..., 1, 0])
        plt.colorbar()
        final_segmentation, energy = icm.icm_ising(negll, beta,
                                                   initial_segmentation)

        initial_segmentation = final_segmentation.copy()
        mu = mu_upd.copy()
        sigmasq = sigmasq_upd.copy()

    difference_map = np.abs(seg_init - final_segmentation)
    npt.assert_equal(np.abs(np.sum(difference_map)) != 0, True)

    return seg_init, final_segmentation, PLY

Basically, here I am testing that the input is correct, that the outputs of the functions calculating the probabilities (PLN and PLY) lie between 0 and 1, and that the parameters (means and variances) are updated accordingly. At the end, I make sure that the final segmentation differs from the initial segmentation that is fed to the Expectation Maximization (EM) algorithm embedded in the for loop.

As can be seen within the loop, the EM algorithm alternates with the ICM (Iterated Conditional Modes) segmentation method, and on each iteration the parameters (means and variances) are updated. Right now the algorithm performs well up to a certain number of iterations; as said, this is due to singularity handling. I will keep testing the algorithm over the next few days to make it as robust as possible. I should start validating the segmentation on Friday and will give a heads-up on how the validation goes. If the results are as expected, I will be ready to try the algorithm not only on T1 images of the brain but also on diffusion-derived scalar maps such as the “Power-maps”.

### Aron Barreira Bordin(Kivy)

#### Progress Report 2

Hi!

I have advanced in my proposal and extra items, and as we are getting closer to the end of the project, let's see how it's going, and my next steps until the end.

### Usual Python/Garden modules on Buildozer spec editor

When writing an application, it is common to forget, or not even know, the name of a package dependency. So I added a list of common Python and Garden modules to the Buildozer Spec Editor.

### Action Item description

By default, Kivy's action button doesn't support a description for each item, so I created a custom one that, in the future, will be used to display action shortcuts.

### New Status Bar

I created a new status bar for Kivy Designer. The old status bar could display only one item at a time and did not work well with small screens.

With the new one, we have three different regions to display information.

### Kivy Designer Tools

I added some tools to help with the project development:

#### Export .png

A helper to create a .png image from the application UI or from the selected widget on the Playground.

#### Check pep8

A simple shortcut to run a PEP 8 checker on the project under development.

#### Create .gitignore

Creates a simple .gitignore for Kivy Designer projects.

### Kivy Modules

I added support to some Kivy modules with KD. When running your project, you can select and use the following modules:

• touchring
• monitor
• screen - where you can set and emulate different screen sizes and dimensions
• inspector
• webdebugger

### Git/Github integration

Using GitPython, Kivy Designer now supports git repositories.

You can start a new repo, commit, add files, pull/push data remotely, check diffs and switch/create branches :)

### Find

A simple tool to find text or regex in source code.

### Bug fixes

I pushed a batch of bug fixes related to the Kivy console, the project loader, and a problem with small screens and ActionGroup.

## August 05, 2015

### Sahil Shekhawat(PyDy)

#### GSoC Week 11

Hi everyone! This week was very productive: I learned a lot and finished many things. My vacations are now over and I have to go to college, but I am able to work around 6 hours on weekdays and make up the time on weekends (Wednesday is off too).

### Michael Mueller(Astropy)

#### Week 10

This week wasn't terribly eventful; I spent time documenting code, expanding tests, etc. for the pull request. Docstrings are now in numpydoc format, and I fixed a few bugs including one that Tom noticed when taking a slice of a slice:

from astropy import table
from astropy.table import table_helpers

t = table_helpers.simple_table(10)
t2 = t[1:]
t3 = t2[1:]

print(t3.indices[0])

The former output was "Index slice (2, 10, 2) of [[ 1 2 3 4 5 6 7 8 9 10], [0 1 2 3 4 5 6 7 8 9]]" while now the step size is 1, as it should be. The SlicedIndex system seems to be working fine otherwise, except for a python3 bug I found involving the new behavior of the / operator (i.e. it returns a float), though this is fixed now.
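
The slice-of-slice arithmetic behind that fix can be reproduced with plain Python. The hypothetical helper below (positive steps only; not Astropy's actual code) composes two slices into one, showing why the composed step must be the product of the two steps, i.e. 1 for two step-1 slices:

```python
def compose_slices(outer, inner, length):
    """Compose t[outer][inner] into a single slice on t.

    Only positive steps are handled in this sketch.
    """
    start1, stop1, step1 = outer.indices(length)
    # Length of the intermediate sequence t[outer]
    n = max(0, (stop1 - start1 + step1 - 1) // step1)
    start2, stop2, step2 = inner.indices(n)
    # Map the inner slice's endpoints back onto the original sequence
    start = start1 + start2 * step1
    stop = start1 + stop2 * step1
    return slice(start, stop, step1 * step2)
```

For the example above, composing `slice(1, None)` with `slice(1, None)` on a length-10 table yields `slice(2, 10, 1)`, matching the corrected behavior.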

Another new change is to the index_mode context manager--the "copy_on_getitem" mode now properly affects only the supplied table rather than tampering with BaseColumn directly. Michael's workaround is to change the __class__ attribute of each relevant column to a subclass (either _GetitemColumn or _GetitemMaskedColumn) with the correct __getitem__ method, and this should rule out possible unlikely side effects. Aside from this, I've also been looking into improving the performance of the engines other than SortedArray. The main issue I see is that there's a lot of Python object creation in the engine initialization, which unfortunately seems to be unavoidable given the constraints of the bintrees library. The success of SortedArray really lies in the fact that it deals with numpy arrays, so I'm looking into creating an ndarray-based binary search tree.

## August 04, 2015

### Where we left last time

Problem: Calculate $P(x'_i|\textbf{x}_{-i})$ efficiently for a Bayesian or a Markov network.

For the case of Markov networks, the expression comes out to be:
$$P(x'_i|\textbf{x}_{-i}) = \frac{\prod_{D_j \ni X_i} \phi_j(x'_i, \textbf{x}_{j, -i})}{\sum_{x''_i} \prod_{D_j \ni X_i} \phi_j(x''_i, \textbf{x}_{j, -i})}$$

The derivation is fairly straightforward and is elucidated in [1].

The case for Bayesian networks is analogous. Both these cases have been implemented in PR #457.

[1]  D. Koller and N. Friedman, "Probabilistic graphical models: principles and techniques," MIT Press, 2009, pp 512-513.
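
The formula can be checked numerically on a toy network. The sketch below uses two made-up pairwise factors over binary variables: only the factors whose scope contains $X_i$ enter the conditional (everything else cancels between numerator and denominator), and a brute-force computation over the full joint confirms it:

```python
import itertools

# Toy chain X0 - X1 - X2 with two made-up pairwise factors
phi01 = [[1.0, 2.0], [3.0, 4.0]]   # phi01[x0][x1]
phi12 = [[5.0, 1.0], [2.0, 6.0]]   # phi12[x1][x2]

def conditional_x1(x0, x2):
    # Formula above: only factors whose scope contains X1 appear
    scores = [phi01[x0][v] * phi12[v][x2] for v in (0, 1)]
    total = sum(scores)
    return [s / total for s in scores]

def brute_force_x1(x0, x2):
    # Same conditional computed from the full unnormalized joint
    joint = {a: phi01[a[0]][a[1]] * phi12[a[1]][a[2]]
             for a in itertools.product((0, 1), repeat=3)}
    num = [joint[(x0, v, x2)] for v in (0, 1)]
    total = sum(num)
    return [n / total for n in num]
```

Both functions agree for every assignment of $(x_0, x_2)$, which is exactly the cancellation the derivation relies on.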

## August 03, 2015

### Goran Cetusic(GNS3)

#### Ubridge is great but...

In my last post I talked about ubridge and how it's supposed to work with GNS3. The problem is that users generally need root permissions, because GNS3 is basically creating a new (veth) interface on the host, and you can't do this without some kind of special permission. Ubridge solves this with Linux capabilities and the setcap command, which is applied when you run "make install" while installing ubridge.

Setting permissions on a file *once* is not a problem, ubridge is already used for VMware and this doesn't really conflict with how GNS3 works. So in GNS3 ubridge should create the veth interfaces for Docker, not GNS3. That's why the newest version of ubridge has some new cool features like hypervisor mode and creating and moving veth interfaces to other namespaces. Here's a quick example:

1. Start ubridge in hypervisor mode on port 9000:

./ubridge -H 9000

2. Connect via telnet on port 9000 and ask ubridge to create a veth pair and move one interface to the namespace of the container:

telnet localhost 9000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'
docker create_veth guestif hostif
100-veth pair created: guestif and hostif
docker move_to_ns guestif 29326
100-guestif moved to namespace 29326

3. Bridge the hostif interface to a UDP tunnel:

bridge create br0
100-bridge 'br0' created
100-NIO Linux raw added to bridge 'br0'
bridge add_nio_udp br0 20000 127.0.0.1 30000
100-NIO UDP added to bridge 'br0'
bridge start br0
100-bridge 'br0' started

That's the general idea of how it should work, but I'm having some problems getting this to work on my Fedora installation in the docker branch. My mentors are being really helpful and are trying to debug this with me. I sent them the output from various components, so here it is for you to get the wider picture.

Manual hypervisor check:
(gdb) run -H 11111
Starting program: /usr/local/bin/ubridge -H 11111
Hypervisor TCP control server started (port 11111).
Destination NIO listener thread for bridge0 has started
Source NIO listener thread for bridge0 has started

GNS3 output:
bridge add_nio_udp bridge0 10000 127.0.0.1 10001
2015-08-01 13:11:22 INFO docker_vm.py:138 gcetusic-vroot-latest-1 has started
bridge add_nio_udp bridge0 10001 127.0.0.1 10000
2015-08-01 13:11:22 INFO docker_vm.py:138 gcetusic-vroot-latest-2 has started

Tcpdump output:
[cetko@nerevar gns3]$ sudo tcpdump -i gns3-veth0ext
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on gns3-veth0ext, link-type EN10MB (Ethernet), capture size 262144 bytes
12:06:51.995942 ARP, Request who-has 10.0.0.2 tell 10.0.0.1, length 28
12:06:52.998217 ARP, Request who-has 10.0.0.2 tell 10.0.0.1, length 28
[cetko@nerevar gns3]$ sudo tcpdump -i gns3-veth1ext
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on gns3-veth1ext, link-type EN10MB (Ethernet), capture size 262144 bytes

Netstat output:
udp        0      0 127.0.0.1:10000         127.0.0.1:10001         ESTABLISHED
udp        0      0 127.0.0.1:10001         127.0.0.1:10000         ESTABLISHED

### Aman Singh(Scikit-image)

#### Porting of function find_objects

I have made the first PR of my GSoC project: porting the function find_objects in the measurements submodule of ndimage. This is one of the most basic functions of the ndimage module; it locates labelled objects in an image and returns slice objects that can be used to extract those objects from the image. Porting this function posed several problems. First, it had to run on the whole range of machines, whether Solaris or Ubuntu 14.10, and on both big-endian and little-endian architectures, so the first challenge was to manage byteswapped pointers. For this we used two NumPy APIs: PyArray_ISBYTESWAPPED(), which checks whether the data behind a given pointer is byteswapped, and copyswap(), which converts a byteswapped value into a native one that we can dereference normally. Initially we used this approach, but it made the whole function look like plain C code. So we decided to use another high-level NumPy API, which is costlier than the original implementation (as it copies the whole array) but makes the implementation more Cythonic and easier to maintain. We have yet to benchmark this version; if the results come out good, we are sticking with it.
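
For context, the semantics of find_objects can be sketched in pure Python. This is a simplified 2-D illustration of what the function computes (the bounding box of each label as a tuple of slices); the real ndimage implementation handles n-D NumPy arrays and is what the port above is about:

```python
def find_objects_sketch(labels, max_label=None):
    """Return, for each label 1..max_label, a (row_slice, col_slice)
    bounding box, or None if the label is absent. 2-D lists only."""
    if max_label is None:
        max_label = max(max(row) for row in labels)
    boxes = [None] * max_label
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            if lab > 0:
                box = boxes[lab - 1]
                if box is None:
                    boxes[lab - 1] = [r, r + 1, c, c + 1]
                else:
                    # Grow the box to include this pixel
                    box[0] = min(box[0], r); box[1] = max(box[1], r + 1)
                    box[2] = min(box[2], c); box[3] = max(box[3], c + 1)
    return [None if b is None else (slice(b[0], b[1]), slice(b[2], b[3]))
            for b in boxes]
```

Applying a returned slice tuple to the image extracts the labelled object's bounding region, which is how the real function is typically used.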

Then there was another difficulty regarding functions that take arrays of fused data types. If the variables are declared in the same file, then using a fused type within that file is easy; but writing a function whose fused type comes from user input is a very tedious task. We finally found a way of implementing it, but it is quite complex and uses function pointers, which makes it horrible to maintain. We are still trying to find an alternative. In my next blog post I will explain how I used function pointers for fused data types where the type depends on the input data given by the user.

Link of the PR is http://www.github.com/scipy/scipy/pull/5006

</Keep Coding>

### Prakhar Joshi(Plone)

#### Testing the transform

Hello everyone, the transform for filtering HTML is now ready and the main task is to test it. For that purpose I have set up the whole test environment for my add-on, using testing.py for unit tests and robot tests.

After setting up the environment, it was time to write unit tests for the transform we have just created, to check that they all pass and that the transform works properly.

For the unit tests I created a test class whose methods call the convert function we created in the transform, passing a data stream as input and checking the output. After writing some 30-35 simple test cases, I ran them and they all passed.

Test cases ran successfully locally :-

Travis is also happy ;)

Yayayay!!! Finally the test cases were passing, so it's a milestone for the project, and it's completed. The PR got merged and things are working as expected.

Now it's time to write more test cases, and to write robot tests that pass a whole HTML page through the script and check the required output. I have tried that manually on the script and it ran perfectly; now it's time to write it in automated form, so that with just one command we can check whether the transform is working perfectly.

The last two weeks were quite hectic, though, as I was busy with the placement season, which paid off as I got placed.

In the next blog post I will write about how I implemented the robot tests for the transform. Stay tuned.

Hope you enjoy!!

cheers,

### Chienli Ma(Theano)

#### OpfromGraph.c_code

In these two weeks connection_pattern and infer_shape were merged. I was supposed to implement the GPU optimization feature, but as I don't have a machine with an Nvidia GPU, after several days I turned to the c_code method instead.

Reusing the code of CLinker was our original idea, but things would not be that simple. CLinker generates code at function scale, which means it treats an OpFromGraph node as a whole function, yet we want to generate code at Op scale. Therefore we need to strip away some of the larger-scale code so that OpFromGraph.c_code() returns 'node like' C code.

There are two solutions. One is to add a new feature to CLinker so that it can detect an OpFromGraph node and apply special behavior. The other is to avoid high-level methods like code_gen() in CLinker and only use its auxiliary methods to assemble 'node like' C code. The latter solution seems easier, and I am working on it. CLinker is a very complicated class; I will try to understand it and work out workable code in the next two weeks.

Wish me good luck~~!

## August 02, 2015

### Jaakko Leppäkanga(MNE-Python)

#### Interactive TFR

For the last two weeks I've been working on interactive TFR and topography views. The interactivity comes from a rectangle selector, which can be used to select time and frequency windows of the TFR and draw a scalp view of the selected area. From the scalp view it is now possible to select channels in a similar manner to draw an averaged TFR of the selected channels. I also fixed a couple of bugs.

The following weeks I'll be mostly finalizing the code as all the planned features are pretty much done.

### Shivam Vats(SymPy)

#### GSoC Week 10

I spent a good amount of time this week trying to make the series function more intelligent about which ring it operates on. The earlier strategy of using the EX ring proved to be slow in many cases. I had discussions with Kalevi Suominen, a developer at SymPy, and we figured out the following strategy:

• The user inputs a Basic expr. We use sring over QQ to get the starting ring.

• We call individual functions by recursing on expr. If expr has a constant term, we create a new ring with the additional generators the function requires (e.g. sin and cos in the case of rs_sin) and expand expr over that.

• This means that each ring_series function can now add generators to the ring it is given, so that it can expand the expression.

This results in considerable speed-up as we do operations on the simplest possible ring as opposed to using EX which is the most complex (and hence slowest) ring. Because of this, the time taken by the series_fast function in faster_series is marginally more than direct function calls. The function doesn’t yet have code for handling arbitrary expressions, which will add some overhead of its own.
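
As a concrete illustration of expanding over a polynomial ring, here is a small sketch using SymPy's public ring_series API (rs_sin over QQ), not the in-progress series_fast code itself:

```python
from sympy.polys.domains import QQ
from sympy.polys.rings import ring
from sympy.polys.ring_series import rs_sin

# A univariate polynomial ring over the rationals
R, x = ring('x', QQ)

# Expand sin(x) as a ring element, with terms up to (but not including) x**8
p = rs_sin(x, x, 8)
# p == x - x**3/6 + x**5/120 - x**7/5040
```

Because the arithmetic happens on sparse polynomials over QQ rather than on Basic expressions, this is the fast path that the ring choice discussed above tries to preserve.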

Most of the extra time is taken by sring. The overhead is constant, however (for a given expression). So for series_fast(sin(a**2 + a*b), a, 10) the extra routines take about 50% of the total time (the function is 2-4x slower). For series_fast(sin(a**2 + a*b), a, 100), they take 2% of the total time and the function is almost as fast as on QQ.

There is, of course scope for speedup in sring (as mentioned in its file). Another option is to minimise the number of calls to sring, if possible to just one (in the series_fast function).

In my last post I talked about the new solveset module that Harsh and Amit are working on. I am working with them, and I sent a patch to add a domain argument to the solveset function. It is pretty cool stuff in that the solution is always guaranteed to be complete.

### Next Week

I haven’t yet been able to start porting the code to SymEngine, as the Polynomial wrappers are not ready. Hopefully they will be done by next week. Till then, I will focus on improving series_fast and any interesting issues that come my way.

• Write a fully functional series_fast. Profile it properly and optimise it.

• Polynomial wrappers.

• Document the functions and algorithms used in ring_series.py

Cheers!!

GSoC Week 10 was originally published by Shivam Vats at Me on August 02, 2015.

## August 01, 2015

### Ambar Mehrotra(ERAS Project)

#### GSoC 2015: 6th Biweekly Report

Hello everyone! The last two weeks were quite exhausting and I couldn't get much work done due to some issues. Still, I did manage to make some very important bug fixes and some other feature additions.

Summary Deletion: A user can now delete summaries as well.
• Navigate to the branch.
• Select the summary you want to delete from the drop down menu in the summary tab.
• Click on "Edit" menu --> "Delete Summary".
With the implementation of this feature, the user will be able to maintain multiple summaries in the GUI as needed. There is no summary modification feature for branches, since you can now both delete and add summaries.

Graphs for Branches: Another feature that I implemented over the past week was the graphs for branches. Earlier the graphs were supported only by the leaves, i.e., the data sources.
• Click on a branch in the tree.
• Go to the Graph tab.
• Select the child whose data you want to view.
I avoided putting graphs for all the children in the same window since, for a branch with a large number of children, it would lead to clutter and chaos.

Bug Fixes: I mentioned in my earlier blog post how a user can create multiple summaries for a branch and view the required one. There were some serious problems with the implementation design of that feature, which took a lot of time to fix.

Video Tutorials: I also made some small tutorials to show the user how to use the GUI. The tutorials describe how to get started, how to add devices and branches, and what modifications you can make to them. Here is the link to the YouTube playlist for the tutorials: Habitat Tutorials Playlist.

I plan to work on alarms starting next week. Happy Coding.

#### Bayesian state space estimation in Python via Metropolis-Hastings

This post demonstrates how to use the [Statsmodels](http://www.statsmodels.org/) tsa.statespace package along with [PyMC](https://pymc-devs.github.io/pymc/) to very simply estimate the parameters of a state space model via the Metropolis-Hastings algorithm (a Bayesian posterior simulation technique). Although the technique is general to any state space model available in Statsmodels and also to any custom state space model, the provided example is in terms of the local level model and the equivalent ARIMA(0,1,1) model.
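As a self-contained illustration of the Metropolis-Hastings idea (a toy random-walk sampler for a Gaussian mean, not the Statsmodels/PyMC code from the notebook; all names and tuning values here are illustrative):

```python
import math
import random

random.seed(0)
# toy data: 200 draws from N(0.5, 1); we pretend the mean is unknown
data = [random.gauss(0.5, 1.0) for _ in range(200)]

def log_post(mu):
    # flat prior, Gaussian likelihood with known unit variance
    return -0.5 * sum((x - mu) ** 2 for x in data)

mu = 0.0
chain = []
for _ in range(5000):
    prop = mu + random.gauss(0.0, 0.2)  # random-walk proposal
    # accept with probability min(1, post(prop) / post(mu))
    if math.log(random.random()) < log_post(prop) - log_post(mu):
        mu = prop
    chain.append(mu)

# discard burn-in, then summarise the posterior draws
posterior_mean = sum(chain[1000:]) / len(chain[1000:])
```

The real post replaces `log_post` with the state space model's log-likelihood computed by the Kalman filter, but the accept/reject loop is the same.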

## July 31, 2015

### Siddhant Shrivastava(ERAS Project)

#### Telerobotics and Bodytracking - The Rendezvous

Hi! The past week was a refreshingly positive one. I was able to solve some of the insidious issues that were plaguing the efforts that I was putting in last week.

## Virtual Machine Networking issues Solved!

I was able to use the Tango server across the Windows 7 Virtual Machine and the Tango Host on my Ubuntu 14.04 host machine. The proper networking mode for this turns out to be Bridged Networking, which basically bridges the Virtual Machine's network connection with the host's.

In the bridged mode, the Virtual Machine exposes a virtual network interface with its own IP address and networking stack. In my case it was vmnet8, with an IP address different from the IP address patterns used by the real Ethernet and WiFi network interface cards. Using bridged mode, I was able to maintain the Tango Device Database server on Ubuntu and use Vito's Bodytracking device on Windows. The Virtual Machine didn't slow things down noticeably while communicating across the Tango devices.

This image explains what I'm talking about -

In bridged mode, I chose the IP Address on the host which corresponds to the Virtual Machine interface - vmnet8 in my case. I used the vmnet8 interface on Ubuntu and a similar interface on the Windows Virtual Machine. I read quite a bit about how Networking works in Virtual Machines and was fascinated by the Virtualization in place.

## Bodytracking meets Telerobotics

With Tango up and running, I had to ensure that Vito's Bodytracking application works on the Virtual Machine. To that end, I installed the Kinect for Windows SDK, Kinect Developer Tools, Visual Python, Tango-Controls, and PyTango. Setting up a new virtual machine slowed me down mildly but was a necessary step in the development.

Once I had that bit running, I was able to visualize the simulated Martian Motivity walk done in Innsbruck in a training station. The Bodytracking server created by Vito published events corresponding to the moves attribute which is a list of the following two metrics -

• Position
• Orientation

I was able to read the attributes that the Bodytracking device was publishing by subscribing to Event Changes to that attribute. This is done in the following way -

    while TRIGGER:
        # Subscribe to the 'moves' event from the Bodytracking interface
        moves_event = device_proxy.subscribe_event(
            'moves',
            PyTango.EventType.CHANGE_EVENT,
            cb, [])
        # Wait for at least REFRESH_RATE seconds for the next callback.
        time.sleep(REFRESH_RATE)


This ensures that the Subscriber doesn't exhaust the polled attributes at a rate faster than they are published. In that unfortunate case, an EventManagerException occurs which must be handled properly.

Note the cb argument: it refers to the callback function that is triggered when an event change occurs. The callback function is responsible for reading and processing the attributes.

The processing part in our case is the core of the Telerobotics-Bodytracking interface. It acts as the intermediary between Telerobotics and Bodytracking, converting the position and orientation values to linear and angular velocities that Husky can understand. I use a high-performance container from the collections module known as deque. It can act both as a stack and a queue via deque.append, deque.appendleft, deque.pop, and deque.popleft.
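A minimal stdlib sketch of the deque bookkeeping described above (the event values are made up for illustration):

```python
from collections import deque

# (timestamp, position) events, oldest on the left
events = deque()
events.append((0.00, 1.0))
events.append((0.25, 1.5))

t_prev, p_prev = events.popleft()   # queue behaviour: pop the oldest event
t_curr, p_curr = events[-1]         # most recent event
linear_speed = (p_curr - p_prev) / (t_curr - t_prev)
print(linear_speed)  # 2.0
```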

To calculate velocity, I compute the differences between consecutive events and their corresponding timestamps. The events are stored in a deque, popped when necessary and subtracted from the current event values.

For instance this is how linear velocity processing takes place -

    # Position and Linear Velocity Processing
    position_previous = position_events.pop()
    position_current = position
    linear_displacement = position_current - position_previous
    linear_speed = linear_displacement / time_delta


## ROS-Telerobotics Interface

We are halfway through the Telerobotics-Bodytracking architecture. Once the velocities are obtained, we have everything we need to send to ROS. The challenge here is to use velocities which ROS and the Husky UGV can understand. The messages are published to ROS only when there is some change in the velocity. This has the added advantage of minimizing communication between ROS and Tango. When working with multiple distributed systems, it is always wise to keep the communication between them minimal. That's what I've aimed to do. I'll be enhancing the interface even further by adding Trigger Overrides in case of an emergency situation. The speeds currently are not ROS-friendly. I am writing a high-pass and low-pass filter to limit the velocities to what Husky can sustain. Vito and I will be refining the user step estimation and the corresponding robot movements respectively.
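The velocity-limiting step can be sketched as a simple clamp (the limit value is an assumption for illustration, not Husky's actual specification):

```python
MAX_LINEAR = 1.0  # assumed safe maximum linear speed, in m/s

def clamp_speed(v, limit=MAX_LINEAR):
    """Clamp a velocity command into [-limit, limit]."""
    return max(-limit, min(limit, v))

print(clamp_speed(5.0))   # 1.0
print(clamp_speed(-3.0))  # -1.0
```

A real high-pass/low-pass filter would also smooth the signal over time; this only bounds its amplitude.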

GSoC is only becoming more exciting. I'm certain that I will be contributing to this project after GSoC as well. The Telerobotics scenario is full of possibilities, most of which I've tried to cover in my GSoC proposal.

I'm back at my university now, and it has become hectic, but enjoyably challenging, to complete this project. My next post will hopefully be a culmination of the Telerobotics/Bodytracking interface and the integration of 3D streaming with Oculus Rift Virtual Reality.

Ciao!

### Vito Gentile(ERAS Project)

#### Enhancement of Kinect integration in V-ERAS: Fifth report

This is my fifth report on what I have done for my GSoC project. If you don’t know what it is about and want to find more information, please refer to this page and this blog post.

After finalizing the user step estimation (which is still being tested by Siddhant, and will probably require some refinements), during the last week I also helped Yuval with some scripts for analyzing users’ data. What I did was mainly to aggregate Kinect and Oculus Rift data and output them in a single file. This was made possible by using the timestamps attached to every single line in the files, in order to synchronize data from different files.

I have not committed these files yet, because Franco has also worked on this; he will probably commit everything soon.

The second (and more important, interesting and compelling) task that I have just finished implementing (although it will need some minor improvements) is hand gesture recognition. This feature is not included in PyKinect, but it ships with the Microsoft Kinect Developer Toolkit, as part of what they call Kinect Interactions. Because PyKinect is based on the C++ Microsoft Kinect API, I decided to implement this feature in that language (so that I can use the API, rather than reimplementing everything from scratch), and then port it to Python by means of ctypes.

I had never used ctypes before, and this implied a lot of hard work, but at the end of the story I figured out how to use this powerful technology. Here are some links useful to anyone who wants to start using it:

The whole C++ module is stored in this directory of the ERAS repository, and its output is a .dll file named KinectGestureRecognizer.dll. This file needs to be placed in the same directory as tracker.py before executing the body tracker, as does KinectInteraction180_32.dll. The latter ships with the Developer Toolkit, and it can be found in C:\Program Files\Microsoft SDKs\Kinect\Developer Toolkit v1.8.0\bin.

Then I wrote a Python wrapper using ctypes; you can see it at this link. I had also tried ctypesgen to automatically generate the Python wrapper from header files, but it didn’t seem easy to use to me (mainly due to some issues with the Visual Studio C++ compiler).
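The basic ctypes pattern is: load the shared library, declare each function's C signature, then call it like a Python function. A runnable sketch (shown with libc's abs so it works anywhere; the post instead loads KinectGestureRecognizer.dll):

```python
import ctypes
import ctypes.util

# On Windows this would be ctypes.CDLL("KinectGestureRecognizer.dll");
# here we load the C library so the example is portable.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

libc.abs.argtypes = [ctypes.c_int]  # declare the C signature explicitly
libc.abs.restype = ctypes.c_int

print(libc.abs(-42))  # 42
```

Declaring `argtypes`/`restype` is what lets ctypes convert between Python objects and C values safely.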

I also had to change some settings to enable Kinect Interactions to work properly, which also required editing tracker.py and gui.py. For instance, I had to change the depth resolution used, which is now 640×480 pixels, while it was 320×240.

Another script involved in the last commit was visualTracker.py. This file is very useful for testing purposes, because it allows you to see an avatar of the user moving in 3D space. After adding gesture recognition, I decided to improve the avatar by coloring the hand joints red when the hand is closed.

I have also helped the IMS with their current mission, AMADEE, by setting up one of their machines (the Windows one). This way we can verify whether what we have developed during these months works fine, by testing it in a real context with several users.

That’s it for the moment. I will update you soon on my GSoC project!

Cheers!

#### week sync 14

## Last week:

• Rewrote all the gadget-graph code using the networkx library.
• Use the algorithms in networkx.algorithms instead of the previous code:
• networkx.topological_sort() instead of ROP.__build_top_sort()
• networkx.all_shortest_paths() instead of ROP.__dfs().
• Filter all binaries as the rop-tools do, regardless of size. Important!
• search_path() returns no more than 10 paths (shortest first), for performance.
• Updated the doctests and the filter's regular expressions.

It is much faster now. We can find gadgets and solve setRegisters() in less than 10 seconds for most binaries, including amoco's loading time, which is about 2 seconds.
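The two networkx calls mentioned above can be sketched on a toy gadget graph (the node names here are illustrative, not from the actual tool):

```python
import networkx as nx

# toy gadget graph: an edge means "gadget A can chain into gadget B"
g = nx.DiGraph()
g.add_edges_from([
    ("pop rdi", "pop rsi"),
    ("pop rsi", "syscall"),
    ("pop rdi", "syscall"),
])

order = list(nx.topological_sort(g))                 # replaces ROP.__build_top_sort()
paths = list(nx.all_shortest_paths(g, "pop rdi", "syscall"))  # replaces ROP.__dfs()
print(paths)  # [['pop rdi', 'syscall']]
```

Capping the result of `all_shortest_paths` at the first 10 paths is then a one-line `itertools.islice`.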

The classifier is the main bottleneck now.

## Next week:

• Fix potential bugs.
• Add AArch64 support.

## July 29, 2015

### Sartaj Singh(SymPy)

#### GSoC: Update Week 8 and 9

It's been a long time since my last post. Holidays are now over and my classes have started. The last few days have been hectic for me. Here are the highlights of my last two weeks with SymPy.

### Highlights:

My implementation of the algorithm to compute formal power series is finally done. As a result, #9639 finally got merged. Thanks to Jim and Sean for all the help. As #9639 brought in all the necessary changes, #9572 was closed.

In the SymPy master,

>>> fps(sin(x), x)
x - x**3/6 + x**5/120 + O(x**6)
>>> fps(1/(1-x), x)
1 + x + x**2 + x**3 + x**4 + x**5 + O(x**6)


On a side note, I was invited for Push access by Aaron. Thanks Aaron. :)

• Improve test coverage of series.formal.
• Start working on operations on Formal Power Series.

### Nikolay Mayorov(SciPy)

#### Robust nonlinear regression in scipy

The last feature I worked on is support for robust loss functions. The results are again available as an IPython Notebook; look here: https://gist.github.com/nmayorov/dac97f3ed9d638043191 (I’m struggling to get “&” to work correctly in LaTeX blocks, so the formatting is a bit off at the moment). The plan is to provide this example as a tutorial for scipy.

## July 28, 2015

### Michael Mueller(Astropy)

#### Week 9

This week I spent quite a bit of time on mixin column support for indices, where appropriate. After first moving the indices themselves from a Column attribute to a DataInfo attribute (accessed as col.info.indices), I moved most of the indexing code for dealing with column access/modifications to BaseColumnInfo for mixin use in methods like __getitem__ and __setitem__. Since each mixin class has to include proper calls to indexing code, mixins should set a boolean value _supports_indices to True in their info classes (e.g. QuantityInfo). As of now, Quantity and Time support indices, while SkyCoord does not since there is no natural order on coordinate values. I've updated the indexing testing suite to deal with the new mixins.

Aside from mixins (and general PR improvements like bug fixes), I implemented my mentors' suggestion to turn the previous static_indices context manager into a context manager called index_mode, which takes an argument indicating one of three modes to set for the index engine. These modes are currently:

• 'freeze', indicating that table indices should not be updated upon modification, such as updating values or adding rows. After 'freeze' mode is lifted, each index updates itself based on column values. This mode should come in useful if users intend to perform a large number of column updates at a time.
• 'discard_on_copy', indicating that indices should not be copied upon creation of a new column (for example, due to calls like "table[2:5]" or "Table(table)").
• 'copy_on_getitem', indicating that indices should be copied when columns are sliced directly. This mode is motivated by the fact that BaseColumn does not override the __getitem__ method of its parent class (numpy.ndarray) for performance reasons, and so the method BaseColumn.get_item(item) must be used to copy indices upon slicing. When in 'copy_on_getitem' mode, BaseColumn.__getitem__ will copy indices at the expense of a reasonably large performance hit. One issue I ran into while implementing this mode is that, for special methods like __getitem__, new-style Python classes call the type's method rather than the instance's method; that is, "col[[1, 3]]" corresponds to something like "type(col).__getitem__(col, [1, 3])" rather than "col.__getitem__([1, 3])". I got around this by adjusting the actual __getitem__ method of BaseColumn in this context (and only for the duration of the context), but this has the side effect that all columns have changed behavior, not just the columns of the table supplied to index_mode. I'll have to ask my mentors whether they see this as much of an issue, because as far as I can tell there's no other solution.
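The special-method lookup behaviour described in the last bullet is easy to demonstrate in plain Python (the class here is a made-up stand-in, not astropy's BaseColumn):

```python
class Column:
    def __getitem__(self, item):
        return "type lookup"

col = Column()
# binding a new __getitem__ on the *instance* does not affect col[...]:
# new-style classes resolve special methods on the type, not the instance
col.__getitem__ = lambda item: "instance lookup"

print(col[0])              # type lookup
print(col.__getitem__(0))  # instance lookup
```

This is why the context manager has to swap the method on the class itself, affecting all columns for the duration of the context.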
At this point I see the PR as pretty much done, although I'll spend more time writing documentation (and making docstrings conform to the numpy docstring standard).

### Sahil Shekhawat(PyDy)

#### GSoC Week 10

Hi everyone, my last post was written at a very bad point. I had lost 3 days of work and was lagging behind my timeline, and I was in a much worse mood than I am now. Now I feel confident about this project because I am finally getting the hang of dynamics (the only real issue).

### AMiT Kumar(Sympy)

#### GSoC : This week in SymPy #9

Hi there! It's been nine weeks into GSoC. Here is the progress for this week.

### Progress of Week 9

This week I worked on replacing solve with solveset or linsolve in the codebase. Here are the modules I have covered so far:

@moorepants pointed out that I should not change the old solve tests, since changing them could let untested code break unnoticed. This argument is valid, so I have added equivalent tests for solveset where it is at parity with solve.

There is also some untested code in the codebase where solve is used; for those cases the replacement has not been done, since the tests would pass anyway given that those lines are not tested. I have added a TODO for those instances, to replace solve with solveset once those lines are tested.

#### Other Work

I also changed the output of linsolve when there is no solution: earlier it threw a ValueError, and now it returns an EmptySet(), which is consistent with the rest of solveset. See PR #9726.
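The new behaviour looks like this (the system below is just an illustrative inconsistent example):

```python
from sympy import S, linsolve, symbols

x, y = symbols('x y')
# an inconsistent system now yields EmptySet rather than raising ValueError
no_sol = linsolve([x + y - 1, x + y - 2], (x, y))
print(no_sol == S.EmptySet)  # True
```

Callers can then treat "no solution" uniformly as a set, just like any other solveset result.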

### from future import plan Week #10:

This week I plan to merge my pending PRs on replacing old solve in the codebase with solveset, and to work on documentation & the lambertw solver.

### $git log

• PR #9726 : Return EmptySet() if there is no solution to the linear system
• PR #9724 : Replace solve with solveset in core
• PR #9717 : Replace solve with solveset in sympy.calculus
• PR #9716 : Use solveset instead of solve in sympy.sets
• PR #9717 : Replace solve with solveset in sympy.series
• PR #9710 : Replace solve with solveset in sympy.stats
• PR #9708 : Use solveset instead of solve in sympy.geometry
• PR #9587 : Add Linsolve Docs
• PR #9500 : Documenting solveset

That's all for now, looking forward to week #10. :grinning:

## July 27, 2015

### Yask Srivastava(MoinMoin)

#### Admin and editor enhancements in wiki

Last week I was down with chicken pox. I had to take tons of medicines :\ . Thankfully, I recovered.

Recently I worked on the restricted Admin page. Only a superuser has access to administrative functions. To become a superuser, add the following line in wikiconfig.py.

The screenshots after the changes: other apparent changes visible in them are the wider navbar and the footers with a bluish background. This was done to give it a distinctive look compared to the basic theme.

### Editor Changes

Currently MoinMoin has a dull editor. It looks more like a simple text box than an editor. Thus a basic toolbar for markdown, creole, html, etc. is an essential feature we are missing.

I used the MarkItUp JavaScript plugin to quickly set up editor-like features for our Markdown wiki. The beautiful thing about this plugin is that it lets us easily modify the toolbar settings by editing the set.js file. This enables us to make an editor that works for multiple syntax languages, since we can load a different set.js file for each content type. This is how it looks in the Markdown editor: http://i.imgur.com/bbLd5Ry.png

RogerHaase tested this today. Things in the editor aren't fully functional yet as I am still in the process of integrating it.
### ToDos:

• QuickLinks
• Error Notification Styling

Commits made last week:

• 953a8cd Local history page themed
• c6f8ed4 Fixed indicator color bug in usersetting
• ecb9cfa Enhanced breadcrumbs in basic theme
• 726692b stretched topnav and header

### Wei Xue(Scikit-learn)

#### GSoC Week 8, 9 and Progress Report 2

## Week 8 and 9

In weeks 8 and 9, I implemented DirichletProcessGaussianMixture. Its behavior looks similar to BayesianGaussianMixture: both of them can infer the best number of components. DirichletProcessGaussianMixture took slightly more iterations than BayesianGaussianMixture to converge on the Old Faithful data set, around 60 iterations.

If we solve the Dirichlet process mixture by Gibbs sampling, we don't need to specify the truncation level T; the concentration parameter $\alpha$ alone is enough. On the other hand, with variational inference, we still need to specify the maximal possible number of components, i.e., the truncation level.

At first, the lower bound of DirichletProcessGaussianMixture seemed a little strange. It was not always going up: when some clusters disappear, it goes down a little bit, then goes straight up. I thought it was because the estimation of the parameters is ill-posed when those clusters have fewer data samples than the number of features. I did the math derivation of Dirichlet process mixture models again, and found it was a bug in the coding of a very long equation.

I also finished the code of BayesianGaussianMixture for 'tied', 'diag' and 'spherical' precision. My mentor pointed out the style problems in my code and docstrings. I knew the PEP8 convention, but had no idea there was also a convention for docstrings, PEP257. It took me a lot of time to fix the style problems.

## Progress report 2

During the last 5 weeks (since progress report 1), I finished:

1. GaussianMixture with four kinds of covariance
2. Most test cases of GaussianMixture
3. BayesianGaussianMixture with four kinds of covariance
4. DirichletProcessGaussianMixture

Although I spent some time on some unsuccessful attempts, such as decoupling the observation models and hidden models into mixin classes and double-checking the DP equations, I did finish the most essential parts of my project and did some visualization. In the following 4 weeks, I will finish all the test cases for BayesianGaussianMixture and DirichletProcessGaussianMixture, and do some optional tasks, such as different covariance estimators and incremental GMM.

### Lucas van Dijk(VisPy)

#### GSoC 2015: Arrows and networks update

A few weeks of development have passed, time for another progress report! The past few weeks a lot of things have been added and/or improved:

• Finished the migration to the new scenegraph and visual system.
• Added another example on how to use the ArrowVisual API (a quiver plot)
• Improved formatting and documentation of the ArrowVisual and Bezier curves code
• Added some tests for the ArrowVisual

## New scenegraph system

This pull request is almost ready to merge, and it's a huge update to the scenegraph and visuals system. The ArrowVisual is now completely ported to this new system.

## New Quiver Plot Example

I've created a new example on how to use the ArrowVisual API. It's a quiver plot, shown below. The arrows will always point towards the mouse cursor.

### Ziye Fan(Theano)

#### [GSoC 2015 Week 7&8]

In Week 7 and Week 8, I mainly worked on the optimization of local_fill_sink. This is more complicated than I thought. For details, check the discussions here.

When the code of this PR is used, the time spent in "canonicalize" is less than with the original code (12 passes in 166 seconds --> 10 passes in 155 seconds, tested with the user's case on my computer). But these changes make Theano fail on a test case, "test_local_mul_switch_sink". This test case checks that the optimizer "local_mul_switch_sink" behaves correctly.

Why does it fail? In short, in this test there is an fgraph like "(* (switch ...) (switch ...))". If this optimizer is applied correctly, the mul op will sink under the switch op, so that expressions like "(* value_x NaN)" can be avoided and we end up with the right result. What stops the optimizer is the assert node inserted into the graph.

What I am working on now is to make MergeOptimizer deal with nodes with assert inputs. Of course this is already another optimization. For the failed test case, one way is to modify the code of local_mul_switch_sink to make it able to apply through assert nodes, but this is not a good way because it is not general.

Please reply here or send me an email if you have any ideas or comments. Thanks very much.

#### [GSoC 2015 Week 5&6]

In Week 5 and Week 6, I worked on a new feature for debugging: displaying the names of rejected optimizers when doing "replace_and_validate()". The PR is here.

To implement this feature in one place in the code, the Python library "inspect" is used. Any time validation fails, the code inspects the current stack frames to find out which optimizer is the caller and whether there is a "verbose" flag. Then the debugging information can be displayed.

Besides, optimization of the local_fill_sink optimizer also began here; the main idea is to make "fill" sink in a recursive way, to be more efficient. The inplace_elemwise optimizer is not merged because of its bad performance.

### Mark Wronkiewicz(MNE-Python)

#### Paris Debriefing

C-day + 62

I just returned a few days ago from the MNE-Python coding sprint in Paris. It was an invigorating experience to work alongside over a dozen of the core contributors to our Python package for an entire week. Putting a face and personality to all of the GitHub accounts I have come to know would have made the trip worthwhile on its own, but it was also a great experience to participate in the sprint by making some strides toward improving the code library too.
Although I was able to have some planning conversations with my GSoC mentors in Paris (discussed later), my main focus for the week was on goals tangential to my SSS project. Along with a bright student in my GSoC mentor’s lab, I helped write code to simulate raw data files. These typically contain the measurement data directly as it comes off the MEEG sensors, and our code will allow the generation of a raw file for an arbitrary cortical activation. It has the option to include artifacts from the heart (ECG), eye blinks, and head movement. Generating this type of data, where the ground truth is known, is especially important for creating controlled data to evaluate the accuracy of source localization and artifact rejection methods – a focus for many researchers in the MEEG field. Luckily, the meat of this code was previously written by a post-doc in my lab for an in-house project – we worked on extending and molding it into a form suitable for the MNE-Python library.

The trip to Paris was also great because I was able to meet my main GSoC mentor and discuss the path forward for the SSS project. We both agreed that my time would be best spent fleshing out all the add-on features associated with SSS (tSSS, fine calibration, etc.), which are all iterative improvements on the original SSS technique. The grand vision is to eventually create an open-source implementation of SSS that can completely match Elekta’s proprietary version. It will provide more transparency, and, because our project is open source, we have the agility to implement future improvements immediately since we are not selling a product subject to regulation. Fulfilling this aim would also add one more brick to the wall of features in our code library.

### Keerthan Jaic(MyHDL)

#### MyHDL GSoC Update

After a long, winding road, MyHDL v0.9.0 has been released with many new features!
Since the release, I’ve been focusing on major, potentially breaking changes to MyHDL’s core for v1.0. I’ve submitted a PR which lays the groundwork for streamlined AST parsing by centralizing AST accesses and reusing ast.NodeVisitors across the core decorators. While this PR is being reviewed, I’m carefully examining MyHDL’s conversion modules in order to centralize symbol table access. I have also been working on improving MyHDL’s conversion tests, using pytest fixtures to enable isolation and parallelization.

## July 26, 2015

### Rupak Kumar Das(SunPy)

#### Update

Hello all! Let me summarize my progress. In the last couple of weeks, I worked on a new feature for Ginga – Intensity Scaling. It basically scales the intensity values relative to the first image so that the changes in brightness of the images can be measured.

With a few small fixes from me, Eric and I have improved some parts of Ginga, like the Cuts plugin and the auto-starting of the MultiDim plugin according to whether the FITS file is multidimensional or not. I have improved the save support branch by fixing all sorts of silly bugs. Although I could not get it to work with OpenCV, the save-as-movie code nevertheless works pretty well and fast.

Now my focus lies on the Slit and Line Profile plugins, which basically need some clean-up. This is the last week of my summer vacation before my college opens. Although it doesn’t seem likely to be a problem, I will try to complete some important parts before that.

Cheers!

### Shivam Vats(SymPy)

#### GSoC Week 9

Like I said in my last post, this was my first week in college after summer vacation. I had to reschedule my daily work according to my class timings (which are pretty arbitrary). Anyway, since I do not have a test anytime soon, things were manageable.

### So Far

#### Ring Series

This week I worked on rs_series in PR 9614.
Keeping in mind Donald Knuth’s ‘Premature optimisation is the root of all evil’, my first goal was to write a function that used ring_series to expand Basic expressions and worked in all cases. That has been achieved. The new function is considerably faster than SymPy’s series in most cases, e.g.:

    In [9]: %timeit rs_series(sin(a)*cos(a) - exp(a**2*b),a,10)
    10 loops, best of 3: 46.7 ms per loop

    In [10]: %timeit (sin(a)*cos(a) - exp(a**2*b)).series(a,0,10)
    1 loops, best of 3: 1.08 s per loop

However, in many cases the speed advantage is not enough, especially considering that all elementary ring_series functions are faster than SymPy’s series functions by factors of 20-100. Consider:

    In [20]: q
    Out[20]: (exp(a*b) + sin(a))*(exp(a**2 + a) + sin(a))*(sin(a) + cos(a))

    In [21]: %timeit q.series(a,0,10)
    1 loops, best of 3: 2.81 s per loop

    In [22]: %timeit rs_series(q,a,10)
    1 loops, best of 3: 3.99 s per loop

In this case, rs_series is in fact slower than the current series method! This means that rs_series needs to be optimised, as expanding the same expression directly with rs_* functions is much faster:

    In [23]: %timeit (rs_exp(x*y,x,10) + rs_sin(x,x,10))*(rs_exp(x**2+ x,x,10) + rs_sin(x,x,10))*(rs_sin(x,x,10) + rs_cos(x,x,10))
    1 loops, best of 3: 217 ms per loop

I spent Friday playing with rs_series. Since the function is recursive, I even tried using a functional approach (with map, reduce, partial, etc.). It was fun exploring SymPy’s functional capabilities (which are quite decent, though Haskell’s syntax is of course more natural). This didn’t make much difference in speed. Code profiling revealed that rs_series is making too many function calls (which is expected). So, I plan to try a non-recursive approach to see if that makes much of a difference. Other than that, I will also try to make it smarter so that it does not go through needless iterations (which it currently does in many cases).

#### SymEngine

I had a discussion with Sumith about Polynomial wrappers.
I am helping him with constructors and multiplication. We both want the basic Polynomial class done as soon as possible, so that I can start writing series expansion of functions using it. I also sent PR 562, which adds C wrappers for the Complex class. This will be especially helpful for the Ruby wrappers that Abinash is working on. FQA is a nice place to read about writing C++/C wrappers, and for some side entertainment too.

Other than that, I also happened to have a discussion with Harsh about the new solveset he and Amit are working on. Their basic idea is that you always work with sets (input and output) and that the user can choose which domain to work in. The latter idea is quite similar to what SymPy’s polys does. Needless to say, their approach is much more powerful than solvers’. I will be working with them.

### Next Week

Targets for the next week are as modest as they are crucial:

• Play with rs_series to make it faster.
• Finish Polynomial wrappers and start working on series expansion.

Cheers!

GSoC Week 9 was originally published by Shivam Vats at Me on July 26, 2015.

## July 25, 2015

### Isuru Fernando(SymPy)

#### GSoC Week 8 and 9

These two weeks Ondrej and I started adding support for different compilers. I added support for MinGW and MinGW-w64. There were some documented, but not yet fixed, bugs in MinGW that I encountered. When including cmath, there were errors saying _hypot was not defined and off64_t was not defined. I added the flags -D_hypot=hypot -Doff64_t=_off64_t to fix this temporarily. With that, symengine was successfully built.

For the Python wrappers on Windows, after building there was a "wrapper not found" error, which was the result of not naming the extension pyd on Windows. Another problem was that the Python distribution's libpython27.a for x64 was compiled for a 32-bit architecture and there were linking errors.
I found some patched files at http://www.lfd.uci.edu/~gohlke/pythonlibs/#libpython and the Python wrappers were built successfully. I also added continuous integration for MinGW using AppVeyor. With MinGW, to install gmp all you had to do was run the command mingw-get install mingw32-gmp. For MinGW-w64, I had to compile gmp myself. For this, AppVeyor came in handy: I started a build, stopped it, and then logged into the AppVeyor machine remotely using Remmina (each VM is shut down after 40 minutes; within those 40 minutes you can log in and debug the build). I compiled gmp using msys and mingw-w64 and then downloaded the binaries to my machine. For AppVeyor runs, these pre-compiled gmp binaries were used to test MinGW-w64.

Ondrej and I worked together to make sure SymEngine could be built using MSVC in Debug mode. Since gmp couldn't be used out of the box with MSVC, we used the MPIR project's sources, which include Visual Studio project files. MPIR is a fork of GMP and provides MSVC support. We used it to build SymEngine with MSVC. Later I added support for Release mode and also added continuous integration for both build types and platform types. The Python extension can also be built with MSVC. We are testing the Python extension in Release mode only right now, because AppVeyor has only Python Release mode libraries, and therefore building the extension in Debug mode gives an error saying python27_d.lib is not found.

I also improved the wrappers for Matrix by adding __getitem__ and __setitem__, so that matrices can be used easily from Python. Another improvement to SymEngine was the automatic simplification of expressions like 0.0*x and x**0.0. These expressions are not simplified in master, so I'm proposing a patch to simplify them to 0.0 and 1.0 respectively.

### Julio Ernesto Villalon Reina(Dipy)

#### Progress Report

Hi all,

During these last three weeks I have been mainly designing, implementing, debugging and running tests for the brain tissue classification code.
It is tough because you have to think of all the possible options, input arguments and noise models that the end user may end up trying. I have learned a lot during this period, mainly because I have never tested any code so thoroughly, and the importance of this kind of practice has really come to my attention. You realize how "fragile" your code can be and how easy it is to make it fail. This has been a true experience of how to develop really robust software. Although my mentors and I decided not to move forward to the validation phase until the testing phase is finished, we decided to refactor the code (in part to make it more robust, as I was saying before) and to cythonize some loops that were making the code slow. This has also been an interesting learning experience, because I am practically new to Cython, and the idea of writing Python-like code that runs at the speed of C seems fascinating to me. I am planning to post more detailed information about the testing over this weekend. I will be working on finalizing this phase of the project over the next couple of days and will then jump directly to the validation step. Keep it up and stay tuned!

### Prakhar Joshi(Plone)

#### Updating the Transform

Hello everyone, it's been quite a long time since I updated this blog, so finally here is the work I have done in the past few weeks. As mentioned in the last blog post, I was able to create a new transform script using lxml, and I described the way I implemented it. When the code was reviewed, Jamie (my mentor) pointed out a very important bug: I was comparing regular expressions against strings and not against tags. So what I do now is convert the whole input string into a tree and then iterate through every node, replacing or removing the unwanted tags as required.

How to work with the tree and replace tags?
Basically, I take the whole document as a string and parse it with an HTML parser, which converts the string into a tree-like structure. The tree has a parent node and child nodes, and we can iterate through the whole tree and manipulate the nodes (or rather, the tags). In the lxml tree structure the nodes are elements (tags); we can iterate over the tree, check each node, and manipulate it accordingly. We can also get the content between the tags via the element's text attribute. So first I create a tree like this:

```python
parser = etree.HTMLParser()
tree = etree.parse(StringIO(html), parser)
```

This creates a tree; the tree variable is an object that, when printed, only tells us the address where the tree is stored. Now that we have a tree object, all we need is to iterate over it, which is easily done like this:

```python
for element in tree.getiterator():
    if element.tag in ('h3', 'h4', 'h5', 'h6', 'div'):
        element.tag = 'p'
    if element.tag in ('html', 'body', 'script'):
        etree.strip_tags(tree, element.tag)
```

This way we can iterate over the nodes and work with the tags.

Why the cleaner function?

After that we convert the whole tree back into a string and pass it to the cleaner function, which cleans the HTML by removing nasty tags and keeping only valid tags; the cleaner function again returns a string of filtered HTML.
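In miniature, this rename-and-clean pipeline can be sketched with the standard library's xml.etree.ElementTree. This is only an illustration, not the actual transform: the real code uses lxml, and ElementTree only handles well-formed markup (lxml's strip_tags and Cleaner also preserve tail text properly, which this sketch ignores).

```python
# Illustrative sketch of the transform pipeline using only the stdlib;
# the real transform uses lxml. ElementTree requires well-formed markup.
import xml.etree.ElementTree as ET

REPLACE_WITH_P = {'h3', 'h4', 'h5', 'h6', 'div'}
NASTY_TAGS = {'style', 'script', 'object', 'applet', 'meta', 'embed'}

def transform(html):
    root = ET.fromstring(html)
    # Pass 1: demote heading/div tags to paragraphs.
    for element in root.iter():
        if element.tag in REPLACE_WITH_P:
            element.tag = 'p'
    # Pass 2: drop nasty tags entirely (including their content).
    for parent in root.iter():
        for child in [c for c in parent if c.tag in NASTY_TAGS]:
            parent.remove(child)
    return ET.tostring(root, encoding='unicode')

print(transform('<body><h3>Title</h3><script>x()</script><div>Text</div></body>'))
# → <body><p>Title</p><p>Text</p></body>
```

The two passes mirror the two stages in the real code: tag renaming on the tree, then nasty-tag removal in the cleaner.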
Since we give a string to the cleaner function, we first convert the tree back into a string like this:

```python
result = etree.tostring(tree.getroot(), pretty_print=True, method="html")
```

After that we pass the result to the cleaner, where the string is cleaned and filtered like this:

```python
NASTY_TAGS = frozenset(['style', 'script', 'object', 'applet', 'meta', 'embed'])
cleaner = Cleaner(kill_tags=NASTY_TAGS, page_structure=False, safe_attrs_only=False)
safe_html = fragment_fromstring(cleaner.clean_html(result))
```

Here we have also created a fragment from the cleaned string.

Why fragment the clean HTML string?

We fragment the string so that we can remove the extra parent tag that usually gets added when we convert a string into a tree; it gets appended and creates false results. So we create a fragment from the single string and then convert it back into a string. It seems quite clumsy to create a fragment and convert it back to a string, but this is the way I found to remove the extra tags. The final string we obtain is the output of the transform, and it seems all test cases are passing. Yayaya!! It's always good to see all the test cases passing. I hope you enjoyed reading this. Next time I will describe more about the testing part of the transform.

Cheers,

### Stefan Richthofer(Jython)

#### JyNI status update

While the midterm milestone focused on building the mirrored reference graph and on detecting and cleaning up native reference leaks, since then I have focused on keeping the reference graph up to date. I also turned the GC demo script into gc unittests; see test_JyNI_gc.py.

### 32 bit (Linux) JNI issue

For some reason test_JyNI_gc fails on 32 bit Linux due to what seems (?) to be a JNI bug. JNI does not properly pass some debug info to the Java side, and this causes a JVM crash.
I spent over a day desperately trying several workarounds, and double- and triple-checked correct JNI usage (the issue would also occur on 64 bit Linux if something were wrong here). The issue persists for Java 7 and 8, whether JyNI is built with gcc or clang. The only way to avoid it seems to be passing less debug info to the Java side in JyRefMonitor.c. Strangely, the issue also persists when the debug info is passed via a separate method call or object. However, it would be hard or impossible to turn this into a reasonably reproducible JNI bug report. For now I have decided not to spend more time on this issue and to remove the debug info right before the alpha3 release. Until that release the gc unittests are not usable on 32 bit Linux. Maybe I will investigate this issue further after GSoC and try to file an appropriate bug report.

### Keeping the gc reference-graph up to date

I went through the C source code of various CPython builtin objects and identified all places where the gc reference graph might be modified. I inserted update code in all these places, though it has only been explicitly tested for PyList so far. All unittests and also the Tkinter demo still run fine with this major change. Currently I am implementing detection of silent modification of the reference graph. While the update code covers all JyNI-internal calls that modify the graph, there might be modifications via macros performed by extension code. To detect these, JyGC_clearNativeReferences in gcmodule is being enhanced with code that checks the objects-to-be-deleted for consistent native reference counts. All counts should be explainable within this subgraph. If there are unexplainable reference counts, this indicates unknown external links, probably created by an extension via some macro, e.g. PyList_SET_ITEM. In this case we update the graph accordingly. Depending on the object type, we might have to resurrect the corresponding Java object. I hope to get this done over the weekend.
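The consistency check described above can be sketched abstractly: count the references each object receives from within the to-be-deleted subgraph, and flag any object whose native refcount exceeds what the subgraph explains. The function and data structures below are illustrative only, not JyNI's actual API.

```python
# Illustrative sketch of the refcount-consistency idea; names and data
# structures are hypothetical, not JyNI's actual implementation.
def find_externally_referenced(refcounts, edges):
    """refcounts: {obj: native refcount}
    edges: iterable of (src, dst) links inside the subgraph.
    Returns the objects whose refcount cannot be explained by the
    subgraph, i.e. objects probably still referenced from outside
    (e.g. via a macro like PyList_SET_ITEM in extension code)."""
    explained = {obj: 0 for obj in refcounts}
    for src, dst in edges:
        explained[dst] += 1
    return {obj for obj, count in refcounts.items()
            if count > explained[obj]}

# Tiny subgraph: 'a' holds 'b', but 'b' has refcount 2, so one
# reference must come from outside the subgraph.
leaks = find_externally_referenced({'a': 0, 'b': 2}, [('a', 'b')])
print(leaks)  # → {'b'}
```

Objects flagged this way are the ones whose links must be added to the graph (and whose Java counterparts may need to be resurrected).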
### Manuel Paz Arribas(Astropy)

#### Progress report

My work over the last 3 weeks has been mainly on the container class for the cube background models (X, Y, energy). The class is called CubeBackgroundModel and the code has recently been merged into the master branch of Gammapy. The class has been remodeled after a few code reviews from its first draft, as described in the post of Friday, June 19, 2015. For instance, it can now read/write 2 different kinds of FITS formats:

• FITS binary tables: more convenient for storage and data analysis.
• FITS images: more convenient for visualization, using for instance DS9.

For the record, FITS is a standard data format widely used in astronomy.

In addition, the plotting methods have also been simplified to allow a more customizable API for the user. Now only one plot is returned by each method, and the user can easily combine the plots as desired with only a few lines of matplotlib code.

A new function called make_test_bg_cube_model has also been added to the repository for creating dummy background cube models. This function creates a background following a 2D symmetric Gaussian model for the spatial coordinates (X, Y) and a power law in energy. The Gaussian width varies in energy from sigma/2 to sigma. An option is also available to mask 1/4th of the Gaussian images. This option will be useful in the future, when testing the still-to-come reprojection methods, which are necessary for applying the background model to the analysis data in order to subtract the background. Since the models are produced in the detector coordinate system (a.k.a. nominal system), they need to be projected to sky coordinates (i.e. Galactic, or RA/Dec) in order to apply them to the data.

The work on the CubeBackgroundModel class has also triggered the development of other utility functions, for instance to create WCS coordinate objects for describing detector coordinates in FITS format, or a converter of Astropy Table objects to FITS binary table ones.
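The shape of the dummy background model described above can be sketched as a plain function: a 2D symmetric Gaussian in (X, Y) whose width grows with energy, times a power law in energy. Everything below (the log-interpolation of the width, the default spectral index, the function name) is an assumed illustration, not the actual make_test_bg_cube_model code.

```python
# Hypothetical sketch of a Gaussian-times-power-law background model;
# the width interpolation and parameter values are assumptions, not
# the actual make_test_bg_cube_model implementation.
import math

def bg_model(x, y, energy, sigma=1.0, index=2.0, e_min=0.01, e_max=100.0):
    """x, y in deg; energy in TeV, with e_min <= energy <= e_max."""
    # Width varies from sigma/2 at e_min to sigma at e_max,
    # interpolated in log(energy).
    frac = (math.log(energy) - math.log(e_min)) / (math.log(e_max) - math.log(e_min))
    width = sigma * (0.5 + 0.5 * frac)
    gauss = math.exp(-(x**2 + y**2) / (2.0 * width**2))
    return gauss * energy ** (-index)
```

Evaluating such a function on a (X, Y, energy) grid gives a cube of the kind the CubeBackgroundModel class stores.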
Moreover, a test file with a dummy background cube produced with the make_test_bg_cube_model tool has been placed in the gammapy-extra repository here, for testing the input/output (read/write) methods of the class.

This work has also triggered some discussions about methods and classes in both the Astropy and Gammapy repositories. As a matter of fact, I am currently resolving some of them, especially in preparation for the release of the Gammapy 0.3 stable version in the coming weeks.

In parallel, I am also working on a script that should become a command-line program to produce background models using the data of a given gamma-ray astronomy experiment. The script is still a first draft, but the idea is to have a program that:

1. looks for the data (all observations of a given experiment)
2. filters out the observations taken on known sources
3. divides the data into groups of similar observation conditions
4. creates the background models and stores them to file

In order to create the model, the following steps are necessary:

• stack events and bin them (fill a histogram)
• apply the livetime correction
• apply the bin volume correction
• smooth the histogram (not yet implemented)

A first glimpse of such a background model is shown in the following animated image (please click on the animation for an enlarged view):

The movie shows a sequence of 4 images (X, Y), one for each energy bin slice of the cube. The image spans 10 deg in each direction, and the energy binning is defined between 0.01 TeV and 100 TeV, equidistant in logarithmic scale. The model is produced for a zenith angle range between 0 deg and 20 deg.

There is still much work to do to polish the script and move most of the functionality into Gammapy classes and functions, until the script is only a few high-level calls to the necessary methods in the correct order.

## July 24, 2015

### Siddhant Shrivastava(ERAS Project)

#### Virtual Machines + Virtual Reality = Real Challenges!
Hi! For the past couple of weeks, I've been trying to get a lot of things to work. Linux and computer networks seem to like me so much that they demand my attention throughout the course of this program. This time it was dynamic libraries, virtual machine networking, Docker containers, head-mounted display errors and so on. A brief discussion of these:

## Dynamic Libraries, Oculus Rift, and Python Bindings

Using the open-source Python bindings for the Oculus SDK available here, Franco and I ran into a problem:

```
ImportError: <root>/oculusvr/linux-x86-64/libOculusVR.so: undefined symbol: glXMakeCurrent
```

To get to the root of the problem, I tried to list all dependencies of the shared object file:

```
linux-vdso.so.1 => (0x00007ffddb388000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f6205e1d000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f6205bff000)
libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007f62058ca000)
libXrandr.so.2 => /usr/lib/x86_64-linux-gnu/libXrandr.so.2 (0x00007f62056c0000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f62053bc000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f62050b6000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f6204ea0000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6204adb000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6206337000)
libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007f62048bc000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f62046b8000)
libXext.so.6 => /usr/lib/x86_64-linux-gnu/libXext.so.6 (0x00007f62044a6000)
libXrender.so.1 => /usr/lib/x86_64-linux-gnu/libXrender.so.1 (0x00007f620429c000)
libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007f6204098000)
libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007f6203e92000)
undefined symbol: glXMakeCurrent (./libOculusVR.so)
undefined symbol: glEnable (./libOculusVR.so)
undefined symbol: glFrontFace (./libOculusVR.so)
undefined symbol: glDisable (./libOculusVR.so)
undefined symbol: glClear (./libOculusVR.so)
undefined symbol: glGetError (./libOculusVR.so)
undefined symbol: glXDestroyContext (./libOculusVR.so)
undefined symbol: glXCreateContext (./libOculusVR.so)
undefined symbol: glClearColor (./libOculusVR.so)
undefined symbol: glXGetCurrentContext (./libOculusVR.so)
undefined symbol: glXSwapBuffers (./libOculusVR.so)
undefined symbol: glColorMask (./libOculusVR.so)
undefined symbol: glBlendFunc (./libOculusVR.so)
undefined symbol: glBindTexture (./libOculusVR.so)
undefined symbol: glDepthMask (./libOculusVR.so)
undefined symbol: glDeleteTextures (./libOculusVR.so)
undefined symbol: glGetIntegerv (./libOculusVR.so)
undefined symbol: glXGetCurrentDrawable (./libOculusVR.so)
undefined symbol: glDrawElements (./libOculusVR.so)
undefined symbol: glTexImage2D (./libOculusVR.so)
undefined symbol: glXGetClientString (./libOculusVR.so)
undefined symbol: glDrawArrays (./libOculusVR.so)
undefined symbol: glGetString (./libOculusVR.so)
undefined symbol: glXGetProcAddress (./libOculusVR.so)
undefined symbol: glViewport (./libOculusVR.so)
undefined symbol: glTexParameteri (./libOculusVR.so)
undefined symbol: glGenTextures (./libOculusVR.so)
undefined symbol: glFinish (./libOculusVR.so)
```

This clearly implied one thing: libGL was not being linked. My task then was to somehow link libGL to the SO file that came with the Python bindings. I tried out the following two options:

• Creating my own bindings: I tried to regenerate the SO file from the Oculus C SDK using the amazing Python Ctypesgen. This didn't work out, as I couldn't resolve the header files required by Ctypesgen. Nevertheless, I learned how to create Python bindings, and that is a huge take-away from the exercise. I had always wondered how Python interfaces are created for programs written in other languages.
• Making the existing shared object file believe that it is linked to libGL: after a lot of searching, I found the nifty little environment variable that worked wonders for our Oculus development: LD_PRELOAD. As these articles delineate, with LD_PRELOAD it is possible to force-load a dynamically linked shared object into memory. If you set LD_PRELOAD to the path of a shared object, that file will be loaded before any other library (including the C runtime, libc.so). For example, to run ls with your special malloc() implementation, do this:

```shell
$ LD_PRELOAD=/path/to/my/malloc.so /bin/ls
```

Thus, the solution to my problem was to place this in the .bashrc file -

```shell
export LD_PRELOAD="/usr/lib/x86_64-linux-gnu/libGL.so"
```

This allowed Franco to create the Oculus Test Tango server and ensured that our Oculus Rift development efforts continue with gusto.

On the programming side, I've been playing around with actionlib to interface Bodytracking with Telerobotics. I have created a simple walker script which provides a certain degree of autonomy to the robot: it avoids collisions with objects by overriding human teleoperation commands. An obstacle could be a Martian rock in a simulated environment, or uneven terrain with a possible ditch ahead. To achieve this, I use the LaserScan message and check the range readings at frequent intervals. The LIDAR readings ensure that the robot is in one of the following states:

• Approaching an obstacle
• Going away from an obstacle
• Hitting an obstacle

The state can be inferred from the LaserScan Messages. A ROS Action Server then waits for one of these events to happen and triggers the callback which tells the robot to stop, turn and continue.
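A minimal sketch of that state inference on plain range readings is below. The thresholds and names are illustrative only, not the actual Telerobotics code, which works on ROS LaserScan messages inside the action server.

```python
# Illustrative sketch of inferring the robot's obstacle state from
# LIDAR range readings; thresholds and names are hypothetical, not
# the actual Telerobotics/actionlib code.
HIT_RANGE = 0.2        # metres: anything closer counts as a hit
OBSTACLE_RANGE = 1.0   # metres: anything closer counts as an obstacle

def obstacle_state(prev_min, curr_min):
    """Classify the robot's state from the closest reading of two
    consecutive LaserScan-style sweeps."""
    if curr_min <= HIT_RANGE:
        return 'hitting'
    if curr_min < OBSTACLE_RANGE and curr_min < prev_min:
        return 'approaching'
    if curr_min > prev_min:
        return 'receding'
    return 'clear'

print(obstacle_state(1.5, 0.6))   # approaching an obstacle
print(obstacle_state(0.6, 0.15))  # hitting an obstacle
```

An action server could then watch for transitions into 'approaching' or 'hitting' and trigger the stop-turn-continue callback described above.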

## Windows and PyKinect

In order to run Vito's bodytracking code, I needed a Windows installation. After running into problems with a 32-bit Windows 7 virtual machine image I had, I had to reinstall and use a 64-bit virtual machine image. I installed all the dependencies to run the bodytracking code. I am still stuck on the networking modes between the virtual machine and the host machine. The TANGO host needs to be configured correctly so that TANGO_MASTER points to the host and TANGO_HOST to the virtual machine.

## Docker and Qt Apps

Qt applications don't seem to work when sharing the display into a Docker container. The way out is to create users in the Docker container, which I'm currently doing. I'll enable VNC and X forwarding to allow the ROS Qt applications to work, so that the other members of the Italian Mars Society can use the Docker container directly.

## Gazebo Mars model

I took a brief look at the 3D models of Martian terrain available for free use on the Internet. I'll be trying to obtain the Gale Crater region and represent it in Gazebo, to drive the Husky over Martian terrain.

## Documentation week!

In addition to strong-arming my CS concepts against the Networking and Linux issues that loom over the project currently, I updated and added documentation for the modules developed so far.

Hope the next post explains how I solved the problems described in this post. Ciao!

### Abraham de Jesus Escalante Avalos(SciPy)

#### A glimpse into the future (my future, of course)

Hello again,

Before I get started I just want to let you know that in this post I will talk about the future of my career and moving beyond the GSoC so this will only be indirectly related to the summer of code.

As you may or may not know, I will start my MSc in Applied Computing at the University of Toronto in September (2015, in case you're reading this in the future). Well, I have decided to steer towards topics like Machine Learning, Computer Vision and Natural Language Processing.

While I still don't know what I will end up having as my main area of focus, nor where this new journey will take me, I am pretty sure it will have to do with Data Science and Python. I am also sure that I will keep contributing to SciPy and will most likely start contributing to other related communities like NumPy, pandas and scikit-learn. You could say that the GSoC has had a positive impact by helping me find areas that make my motivation soar, and by introducing me to people who have been working in this realm for a very long time and know a ton of stuff that makes me want to pick up a book and learn.

In my latest meeting with Ralf (my mentor), we had a discussion regarding the growing scope of the GSoC project and my concern about dealing with all the unforeseen and ambiguous details that arise along the way. He seemed oddly pleased as I proposed to keep in touch with the project even after the "pencils down" date for the GSoC. He then explained that this is the purpose of the summer of code (to bring together students and organisations) and their hope when they choose a student to participate is that he/she will become a longterm active member of the community which is precisely what I would like to do.

I have many thanks to give and there is still a lot of work to be done with the project so I will save the thank you speech for later. For now I just want to say that this has been a great experience and I have already gotten more out of it than I had hoped (which was a lot).

Until my next post,
Abraham.

#### Progress Report

Hello all,

A lot of stuff has happened in the last couple of weeks. The project is coming along nicely and I am now getting into some of the bulky parts of it.

There is an issue with the way NaN (not a number) checks are handled that spans beyond SciPy. Basically, there is no consensus on how to deal with NaN values when they show up. In statistics they are often assumed to be missing values (e.g. there was a problem when gathering statistical data and the value was lost), but there is also the IEEE NaN, which is defined as 'undefined' and can be used to indicate out-of-domain values that may point to a bug in one's code or a similar problem.

Long story short, the outcome of this will largely depend on the way projects like pandas and NumPy decide to deal with it in the future, but for now in SciPy we decided that we should not get into the business of assuming that NaN values signify 'missing', because that is not always the case and it may end up silently hiding bugs, leading to incorrect results without the user's knowledge. Therefore, I am now implementing a backwards-compatible API addition that will allow the user to define whether to ignore NaN values (assume they are missing), treat them as undefined, or raise an exception. This is a long-term effort that may span the entire stats module and beyond, so the work I am doing now is set to spearhead future development.
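A minimal sketch of what such an API addition could look like is below. The keyword name nan_policy and the helper function are illustrative assumptions; the actual SciPy implementation may differ.

```python
# Illustrative sketch of a backwards-compatible NaN-handling switch;
# the nan_policy keyword and helper are hypothetical, not SciPy's
# actual implementation.
import math

def _handle_nan(values, nan_policy='propagate'):
    has_nan = any(math.isnan(v) for v in values)
    if nan_policy == 'raise' and has_nan:
        raise ValueError('input contains NaN')
    if nan_policy == 'omit':
        # Treat NaNs as missing values and drop them.
        return [v for v in values if not math.isnan(v)]
    # 'propagate' (the default): leave NaNs in, so the result is NaN.
    return list(values)

def mean(values, nan_policy='propagate'):
    values = _handle_nan(values, nan_policy)
    return sum(values) / len(values)

print(mean([1.0, 2.0, float('nan')], nan_policy='omit'))  # → 1.5
```

The default preserves existing behaviour (NaN in, NaN out), which is what makes the addition backwards compatible.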

Another big issue is the consistency of the scipy.stats module with its masked arrays counterpart scipy.mstats. The implementation will probably not be complicated but it encompasses somewhere around 60 to 80 functions so I assume it to be a large and time consuming effort. I expect to work on this for the next month or so.

During the last month or two there have been some major developments in my life that are indirectly related to the project, so I feel they should be addressed, but I intend to do so in a separate post. For now I bid you farewell and thank you for reading.

Cheers,
Abraham.

### Abhijeet Kislay(pgmpy)

#### Progress report after mid-term

Woah! I am enjoying every bit of coding now. I have figured out the whole algorithm from the basics! I had been a fool all the time :) . Though I still think that my current understanding wouldn't have been possible if I hadn't made all those mistakes! So thumbs up! Spending time with the […]

### Siddharth Bhat(VisPy)

#### GSoC VisPy Week 6

As usual, here’s the update for week 6. Let’s get down to it!

## SceneGraph Overhaul

The fabled PR is yet to be closed, but we have everything we need for it to be merged. There were 2 remaining (outstanding) bugs related to the Scenegraph - both stemming from the fact that not all uniforms that were being sent to the shader were being used correctly. One of these belonged to the MeshVisual, a Visual that I had ported, so tracking this down was relatively easy. The fix is waiting to be merged.

The other one was a shader compilation bug and was fixed by Eric.

Once Luke Campagnola is back, these changes should get merged, and the PR should be merged as well. That would be closure to the first part of my GSoC project.

## Plotting API

The high level plotting API has been coming together - not at the pace that I would have loved to see, but it's happening. I've been porting the ColorBarVisual so it can be used with a very simple, direct API. I'd hit some snags with text rendering, but they were resolved with Eric's help.

Text rendering was messed up initially, as my code wasn’t respecting coordinate systems. I rewrote the buggy code I’d written to take the right coordinate systems into account.

Another bug arose from the problem that I wasn’t using text anchors right. I’d inverted what a top anchor and a bottom anchor does in my head. The top anchor makes sure that all text is placed below it, while the bottom anchor pushes text above itself. Once that was fixed, text rendered properly.

However, there's still an inconsistency: I don't fully understand the way anchors interact with transforms. The above solution works under translation/scaling, but breaks under rotation. Clearly, there are gaps in my knowledge. I'll be spending time fixing this, but I'm reasonably confident that it shouldn't take too much time.

There was also a bug related to bounding box computation that was caught in the same PR, which I’ve highlighted.

## Intellisense / Autocomplete

There’s a module in VisPy called vispy.scene.visuals whose members are generated by traversing another module (vispy.visuals) and then wrapping (“decorating”, for the Design Patterns enthusiasts) the members of vispy.visuals.

Since this is done at run-time (to be more explicit, this happens when the module is initialized), no IDE / REPL was able to access this information. This caused autocomplete for vispy.scene.visuals to never work.

After deliberating, we decided to unroll the loop and hand-wrap the members, so that autocomplete would work.

This is a very interesting trade-off, where we’re exchanging code compactness / DRY principles for usability.
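The runtime-wrapping pattern described above can be sketched as follows. This is a simplified, hypothetical stand-in for what vispy.scene.visuals does (the real code wraps Visual classes with scene-graph machinery); it shows why the generated names are invisible to static analysis, since they only exist after the loop runs at import time.

```python
# Simplified, hypothetical sketch of generating wrapped module members
# at import time, as vispy.scene.visuals does; IDEs cannot autocomplete
# these names because they only exist after this loop has run.
import types

# Stand-in for vispy.visuals with a couple of Visual classes.
source = types.ModuleType('visuals')
source.LineVisual = type('LineVisual', (), {})
source.MeshVisual = type('MeshVisual', (), {})

# Stand-in for vispy.scene.visuals, populated by traversing `source`.
target = types.ModuleType('scene_visuals')
for name in dir(source):
    if name.endswith('Visual'):
        base = getattr(source, name)
        # "Decorate" the visual with scene-node behaviour.
        wrapped = type(name.replace('Visual', ''), (base,),
                       {'is_scene_node': True})
        setattr(target, wrapped.__name__, wrapped)

print(target.Line.is_scene_node)  # → True
```

Hand-unrolling this loop into one explicit class per visual trades DRY for names a static analyzer can see, which is exactly the trade-off the PR makes.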

Here’s the pull request waiting to be merged.

## Odds and Ends

I’ve been meaning to improve the main github page of VisPy, so it provides more context and development information to developers. There’s an open PR that I want to get merged by the end of this week.

That’s all for now! Next week should see the ColorBar integrated into vispy.plot. I’ll hopefully be working on layout using Cassowary by this time next week, but that’s just a peek into the future :). Adios!

### Aman Jhunjhunwala(Astropy)

#### GSOC ’15 Post 4 : AstroPython Preview Phase Begins

With four weeks left in what has been an amazing Summer of Code, we are now ready for a limited preview launch! But before that, a summary of the progress of the past 2-3 weeks:

The time has been used to mature the site and plug every remaining hole!

• The post-midterm phase began with setting up RSS/Atom feeds for our website.
• This was quickly followed by integrating Sprint Forum for Q&A on our application, which took a long time to integrate and was later removed, as it didn't quite fit the concept of the site.
• The "Package Section" of the site has gone through multiple design overhauls; finally, an old-school table list view was chosen as the best fit.
• Advanced filtering mechanisms are in place! When displaying all the posts, options can now be combined (e.g. sort by ratings + tags: Python, ML + from native content).
• Added a blog-aggregating feature (admin only): add the RSS/Atom feed address of a particular website and the section in which the posts should appear (NEWS for a news website, etc.).
• Added the ability to edit any content. A pencil-like icon appears when hovering over an editable section; clicking it allows you to edit that section.
• Added a timeline feature (in the TIMELINE section): posts are now displayed in time order.
• If the abstract is absent, the first 4 lines of the post are displayed. This was a bit difficult, as all the HTML and Markdown syntax needed to be removed.
• Better way to add tags in the creation form. If tags are present in a section, all of them are displayed; clicking a tag adds it to the form.
• Deployment to the production server, which comes with its own problems and bugs!
• Framed a deployment guide with step-by-step instructions for setting up the server and configuring the necessary system files.
• Set up our backup & restore mechanisms (only partial testing done). For backup, all files are stored in JSON format in the /backup folder and are then restored using the restore script. It runs as a cron task so that the backup stays up to date.
• The mail server was set up in collaboration with SendGrid to send moderation approval/rejection emails, notifications, etc. to moderators, admins and users alike. Every time a post is added, the moderators are informed, and every time a post is approved, its authors are informed!
• Added a "Live Coding Area" on our homepage to introduce Python to new learners (using Skulpt and Trinket)! Users can code in Python right on the homepage.
• Added a periodic newsletter delivery mechanism, which was again shelved for later, if/when the site's popularity rises.
• Lots of major/minor CSS changes, including favicon generation, etc., and bug fixing! A huge amount of time was spent on this!
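The abstract fallback mentioned above (show the first lines of a post after stripping markup) can be sketched with the standard library's html.parser. This is a hypothetical illustration, not the site's actual code.

```python
# Hypothetical sketch of the abstract fallback: strip HTML tags and
# return the first few lines of remaining text. Not the actual
# astropython.org implementation.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Collect only the text between tags.
        self.chunks.append(data)

def make_abstract(html, max_lines=4):
    extractor = TextExtractor()
    extractor.feed(html)
    text = ''.join(extractor.chunks)
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return '\n'.join(lines[:max_lines])

print(make_abstract('<p>First</p>\n<p>Second</p>'))
```

Markdown posts would additionally need the Markdown rendered (or its syntax stripped) before this step, which is what made the feature fiddly.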

For the limited preview phase, we invited the Astropy community(through their mailing list) to review the website and share their feedback with us . The invitation mail goes as :-

This summer we have been fortunate to have a very talented Google Summer of Code student, Aman Jhunjhunwala, working to create a brand new astropython.org site from scratch.  In addition to writing all the code, Aman has also brought a fresh design look and many cool ideas to the project.

The primary driver is to make a modern site that would engage the community and succeed in having a broad base of contributors to the site content.  To that end the site allows contributions from anyone who authenticates via Github, Google, or Facebook.  By default all such content will be moderated before appearing on the live site, but there is also a category for trusted users that can post directly.  During the preview phase moderation is disabled, so you can post at will!

The preview version is available at:

At this time we are opening the site for a preview in order to get feedback from the AstroPython community regarding the site including:

• Overall concept, in particular whether the site design will be conducive to a broad base of contributors.  Would you want to post?
• Are the authentication options obvious and sufficient?
• General site organization and ease of navigating
• How easily can you find information?
• Visual design and aesthetics
• Other features, e.g. a Q&A forum?
• Accessibility
• Ease of posting
• Bugs
• Code review (https://github.com/x-calibre/astropython)
• Security
We highly encourage you to post content in any / all of the categories.  You can either try to play nice and use things as we intended, or do stress testing and try to break it.  All posts will show up immediately, so please do act responsibly in terms of the actual content you post.
As you find issues or have comments, please first check the site github repo to see if it is already known:
Ideally put your comments into github.  For most of the broad review categories above I have already created a placeholder issue starting with [DISCUSS].  Also, if you’d rather not use github then just send to me directly at taldcroft@gmail.com.
Best regards,
Tom Aldcroft,
Aman Jhunjhunwala,
Jean Connelly, and
Tom Robitaille
Next update in about 15 days, when we end our preview phase and hopefully push through our final deployment!
Aman Jhunjhunwala

### Jakob de Maeyer(ScrapingHub)

#### The add-on system in action

In my earlier posts, I have talked mostly about the motivation behind and the internal implementation of Scrapy’s add-on system. Here, I want to show the add-on framework in action, i.e. how it actually affects the users’ and developers’ experience. We will see how users are able to configure built-in and third-party components without worrying about Scrapy’s internal structure, and how developers can check and enforce requirements for their extensions. This blog entry will therefore probably feel a little like a documentation page, and indeed I hope that I can reuse some of it for the official Scrapy docs.

### From a user’s perspective

To enable an add-on, all you need to do is provide its path and, if necessary, its configuration to Scrapy. There are two ways to do this:

• via the ADDONS setting, and
• via the scrapy.cfg file.

As Scrapy settings can be modified from many places, e.g. in a project’s settings.py, in a Spider’s custom_settings attribute, or from the command line, using the ADDONS setting is the preferred way to manage add-ons.

The ADDONS setting is a dictionary in which every key is the path to an add-on. The corresponding value is a (possibly empty) dictionary containing the add-on configuration. While more precise, specifying the full Python path is not necessary if the add-on is either built into Scrapy or lives in your project’s addons submodule.

This is an example where an internal add-on and a third-party add-on (in this case one requiring no configuration) are enabled/configured in a project’s settings.py:

ADDONS = {
    'httpcache': {
        'expiration_secs': 60,
        'ignore_http_codes': [404, 405],
    },
}


It is also possible to manage add-ons from scrapy.cfg. While the syntax is a little friendlier, be aware that this file, and therefore the configuration in it, is not bound to a particular Scrapy project. While this should not pose a problem when you use the project on your development machine only, a common stumbling block is that scrapy.cfg is not deployed via scrapyd-deploy.

In scrapy.cfg, section names, prepended with addon:, replace the dictionary keys. I.e., the configuration from above would look like this:

[addon:httpcache]
expiration_secs = 60
ignore_http_codes = 404,405



### From a developer’s perspective

Add-ons are (any) Python objects that provide Scrapy’s add-on interface. The interface is enforced through zope.interface, but the choice of Python object is left up to the developer. Examples:

• for a small pipeline, the add-on interface could be implemented in the same class that also implements the open/close_spider and process_item callbacks
• for larger add-ons, or for clearer structure, the interface could be provided by a stand-alone module

The absolute minimum interface consists of two attributes:

• name: string with add-on name
• version: version string (PEP 440, e.g. '1.0.1')

Of course, stating just these two attributes will not get you very far. Add-ons can provide three callback methods that are called at various stages before the crawling process:

###### update_settings(config, settings)

This method is called during the initialization of the crawler. Here, you should perform dependency checks (e.g. for external Python libraries) and update the settings object as needed, e.g. enable components for this add-on or set required configuration of other extensions.

###### check_configuration(config, crawler)

This method is called when the crawler has been fully initialized, immediately before it starts crawling. You can perform additional dependency and configuration checks here.

###### update_addons(config, addons)

This method is called immediately before update_settings(), and should be used to enable and configure other add-ons only.

When using this callback, be aware that there is no guarantee in which order the update_addons() callbacks of enabled add-ons will be called. Add-ons that are added to the add-on manager during this callback will also have their update_addons() method called.
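As a sketch of how update_addons() might be used to pull in a dependency add-on: the add-on manager is modeled here as a plain dict of add-on paths to configurations (mirroring the ADDONS setting); the real manager object's interface may differ:

```python
class MyAddon(object):
    name = 'myaddon'
    version = '1.0'

    def update_addons(self, config, addons):
        # Ensure the built-in HTTP cache add-on is enabled before
        # update_settings() runs, with a default config if absent.
        if 'httpcache' not in addons:
            addons['httpcache'] = {'expiration_secs': 60}

# Minimal stand-in for the add-on manager: a dict of
# add-on path -> config, as in the ADDONS setting.
addons = {}
MyAddon().update_addons(config={}, addons=addons)
assert 'httpcache' in addons
```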

Additionally, add-ons may (and should, where appropriate) provide one or more attributes that can be used for limited automated detection of possible dependency clashes:

• requires: list of built-in or custom components needed by this add-on, as strings

• modifies: list of built-in or custom components whose functionality is affected or replaced by this add-on (a custom HTTP cache should list httpcache here)

• provides: list of components provided by this add-on (e.g. mongodb for an extension that provides generic read/write access to a MongoDB database)
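A naive sketch of the kind of automated clash detection these attributes enable (my own illustration, not Scrapy's actual implementation): two add-ons that both modify or provide the same component are flagged:

```python
def find_clashes(addons):
    """Report components modified/provided by more than one add-on."""
    seen = {}     # component name -> add-on that claimed it first
    clashes = []
    for addon in addons:
        for comp in getattr(addon, 'modifies', []) + getattr(addon, 'provides', []):
            if comp in seen:
                clashes.append((comp, seen[comp], addon.name))
            else:
                seen[comp] = addon.name
    return clashes

class CustomCache(object):
    name = 'customcache'
    modifies = ['httpcache']

class OtherCache(object):
    name = 'othercache'
    modifies = ['httpcache']

# Both add-ons replace the HTTP cache: a clash is reported.
assert find_clashes([CustomCache(), OtherCache()]) == [
    ('httpcache', 'customcache', 'othercache')]
```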

The main advantage of add-ons is that developers gain better control over how and under what conditions their Scrapy extensions are deployed. For example, it is now easy to check for external libraries and have the crawler shut down gracefully if they are not available:

class MyAddon(object):
    name = 'myaddon'
    version = '1.0'

    def update_settings(self, config, settings):
        try:
            import boto
        except ImportError:
            raise RuntimeError("boto library is required")
        else:
            # Perform configuration
            pass


Or, to avoid unwanted interplay with other extensions, add-ons, or the user, it is now also easy to check for misconfiguration in the final (final!) settings used to crawl:

class MyAddon(object):
    name = 'myaddon'
    version = '1.0'

    def check_configuration(self, config, crawler):
        if crawler.settings.getbool('DNSCACHE_ENABLED'):
            # The spider, some other add-on, or the user messed with the
            # DNS cache setting
            raise ValueError("myaddon is incompatible with DNS cache")


Instead of depending on the user to activate components and then gathering configuration from the global settings namespace on initialization, it becomes feasible to instantiate the components ad hoc:

from path.to.my.pipelines import MySQLPipeline

class MyAddon(object):
    name = 'myaddon'
    version = '1.0'

    def update_settings(self, config, settings):
        # Instantiate the pipeline ad hoc, handing it the add-on config
        mysqlpl = MySQLPipeline(config)
        settings.set(
            'ITEM_PIPELINES',
            {mysqlpl: 200},
        )


Often, it will not be necessary to write an additional class just to provide an add-on for your extension. Instead, you can simply provide the add-on interface alongside the component interface, e.g.:

class MyPipeline(object):
    name = 'mypipeline'
    version = '1.0'

    def process_item(self, item, spider):
        # Do some processing here
        return item

    def update_settings(self, config, settings):
        settings.set(
            'ITEM_PIPELINES',
            {self: 200},
        )


### Richard Plangger(PyPy)

#### GSoC: Bilbao, ABC and off by one

Bilbao

I have never attended a programming conference before. Some thoughts and impressions:
• The architecture of the conference center is impressive.
• Python is heavily used in numerical computation, data analysis and processing (I thought it was less).
• Pick any bar and order a vegetarian dish (if there is one): it will most certainly contain meat or fish.
• PyPy is used, but most people are unaware that it provides a JIT compiler for Python that speeds up computations and reduces memory usage.
It was a good decision to come to EuroPython, meet people (especially the PyPy dev team) and see how things work in the Python community. See you next time :)

I also worked on my proposal all along. Here are some notes on what I have been working on (before Bilbao).

ABC Optimization

One "roadblock" I did not tackle is vectorization of "user code". The vecopt branch in its current shape was not able to efficiently transform the most basic Python for loop accessing array instances of the array module. (Micro)NumPy kernels work very well (which is the main use case), but Python loops are a different story. Obviously, they are not that easy to vectorize, because it is much more likely that many guards and state-modifying operations (other than store) are present.

In the worst case the optimizer just bails out and leaves the trace as it is.
But I think at least the simplest loops should work as well.

So I evaluated what needs to be done to make this happen: Reduce the number of guards, especially Array Bound Checks (ABC). PyPy does this already, but the ones I would like to remove need a slightly different treatment. Consider:

i = 0
while i < X:
    a[i] = b[i] + c[i]
    i += 1

There are four guards in the resulting trace: one protecting the index to stay below X, and three protecting the array accesses. You cannot omit them, but you can move them outside the loop. The idea is to introduce guards that make the checks (other than the index guard) redundant. Here is an example:

guard(i < X)      # (1) protects the index
guard(i < len(a)) # (2) leads to IndexError

Assume X <= len(a). Then (1) implies (2), so (2) is redundant and guard(X <= len(a)) can be checked once before the loop is entered. That works well for a well-behaved program. In order to pick the right guard as a reference (the index guard might not be the first guard), I'll take a look at the runtime value. The minimum value is preferable, because it is the strongest assumption.
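In pure-Python terms, the transformation amounts to replacing per-iteration bound checks with a single check hoisted before the loop. This is only a conceptual sketch of the semantics, not actual PyPy trace operations:

```python
def add_arrays_checked(a, b, c, X):
    # Unoptimized: every iteration re-checks the array bounds (the ABCs).
    i = 0
    while i < X:
        if i >= len(a) or i >= len(b) or i >= len(c):
            raise IndexError
        a[i] = b[i] + c[i]
        i += 1

def add_arrays_hoisted(a, b, c, X):
    # Optimized: guard(X <= len(...)) is checked once, outside the loop;
    # inside the loop only the index guard i < X remains.
    if X > len(a) or X > len(b) or X > len(c):
        raise IndexError
    i = 0
    while i < X:
        a[i] = b[i] + c[i]
        i += 1

a = [0, 0, 0]
add_arrays_hoisted(a, [1, 2, 3], [4, 5, 6], 3)
assert a == [5, 7, 9]
```

Both versions raise IndexError in the same situations; the hoisted variant just decides it before any work is done, which is what makes the in-loop guards removable.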

I'm not yet sure if this is the best solution, but it is certainly simple and yields the desired result.

Off by one

A few commits ago, the last few "off by one" iterations of a NumPy call were always handled by the blackhole interpreter, eventually compiling a bridge out of the guard. This made the last iterations unnecessarily slow. Now, PyPy has the bare essentials to create trace versions and immediately stitch them to a guarding instruction. The original version of the loop is compiled to machine code at the same time as the optimized version and attached to the guards within the optimized version.

Ultimately, a vectorized trace exits an index guard immediately, leading into a trace loop that handles the last remaining elements.

### Daniil Pakhomov(Scikit-image)

#### Google Summer of Code: Creating Training set.

In this post I describe the process of creating a dataset for training the classifier that I use for face detection.

## Positive samples (Faces).

For this task I decided to take the Web Faces database. It consists of 10,000 faces. Each face comes with eye coordinates, which is very useful because we can use this information to align the faces.

Why do we need to align faces? Take a look at this photo:

If we just crop the faces as they are, it will be really hard for the classifier to learn from them. The reason is that we don’t know how the faces in the database are positioned. In the example above, the face is rotated. In order to get a good dataset we first align the faces and then add small random transformations that we can control ourselves. This is really convenient, because if the training goes badly we can just change the parameters of the random transformations and experiment.

In order to align the faces, we take the coordinates of the eyes and draw a line through them. Then we rotate the image to make this line horizontal. Before running the script, the size of the resulting images is specified, along with the amount of area above and below the eyes and on the right and left sides of the face. The cropping also takes care of the aspect ratio: if we blindly resize the image, the resulting face will be distorted and the classifier will perform badly. This way we can be sure that all our faces are placed consistently, and we can start to run random transformations. The idea that I described was taken from the following page.
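The rotation angle itself follows directly from the eye coordinates. A small sketch (the function name is mine; in practice the returned angle would then be applied with something like skimage.transform.rotate):

```python
import math

def eye_alignment_angle(left_eye, right_eye):
    """Angle (in degrees) of the line through the eyes relative to horizontal.

    Rotating the image by this angle makes the eye line horizontal.
    Coordinates are (x, y) pixel positions.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))

# Eyes at the same height: no rotation needed.
assert abs(eye_alignment_angle((10, 50), (60, 50))) < 1e-9
# Right eye 50 px lower than the left: rotate by 45 degrees.
assert abs(eye_alignment_angle((10, 50), (60, 100)) - 45.0) < 1e-9
```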

Have a look at the aligned faces:

As you can see, the amount of area is consistent across images. The next stage is to transform them in order to augment our dataset. For this purpose we will use OpenCV’s create_samples utility. This utility takes all the images and creates new ones by randomly transforming them and changing the intensity in a specified manner. For my purposes I have chosen the following parameters: -maxxangle 0.5 -maxyangle 0.5 -maxzangle 0.3 -maxidev 40. The angles specify the maximum rotation angles in 3D, and maxidev specifies the maximum deviation of the intensity changes. The script also puts the images on a background specified by the user.

This process gets really complicated if you want to extract images in the end rather than OpenCV’s .vec file format.

This is a small description on how to do it:

1. Run the bash command find ./positive_images -iname "*.jpg" > positives.txt to get a list of positive examples (positive_images is a folder with positive examples).
2. Do the same for the negatives: find ./negative_images -iname "*.jpg" > negatives.txt.
3. Run createtrainsamples.pl like this: perl createtrainsamples.pl positives.txt negatives.txt vec_storage_tmp_dir. Internally it uses opencv_createsamples, so you have to have that compiled. This command transforms each image in positives.txt and places the results as .vec files in the vec_storage_tmp_dir folder (you can get the script from here). We will concatenate them in the next step.
4. Run python mergevec.py -v vec_storage_tmp_dir -o final.vec. You will get one .vec file with all the images (you can get this file from here).
5. Run vec2images final.vec output/%07d.png -w size -h size. All the images will be in the output folder. vec2images has to be compiled; you can get the source from here.

You can see the results of the script now:

## Negative samples.

Negative samples were collected from the AFLW database by eliminating the faces from the images and taking random samples from them. This makes sense because the classifier then learns negative samples from the kinds of images where faces are usually located. Some people take random pictures of text or walls as negative examples, but it makes more sense to train the classifier on the things that will most probably appear in images containing faces.
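The sampling idea above can be sketched as drawing random crops from an image and rejecting any that overlap a known face box. The coordinates, sizes, and function names here are illustrative, not the actual dataset code:

```python
import random

def overlaps(crop, box):
    """Axis-aligned rectangle overlap test; rectangles are (x, y, w, h)."""
    cx, cy, cw, ch = crop
    bx, by, bw, bh = box
    return cx < bx + bw and bx < cx + cw and cy < by + bh and by < cy + ch

def sample_negatives(img_w, img_h, face_boxes, crop_size, n, seed=0):
    """Random square crops that avoid all annotated face boxes."""
    rng = random.Random(seed)
    crops = []
    while len(crops) < n:
        x = rng.randrange(0, img_w - crop_size)
        y = rng.randrange(0, img_h - crop_size)
        crop = (x, y, crop_size, crop_size)
        if not any(overlaps(crop, b) for b in face_boxes):
            crops.append(crop)
    return crops

faces = [(100, 100, 80, 80)]
negatives = sample_negatives(640, 480, faces, crop_size=64, n=5)
assert len(negatives) == 5
assert all(not overlaps(c, faces[0]) for c in negatives)
```

Each accepted crop would then be saved as a negative training image.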

### Sumith(SymPy)

#### GSoC Progress - Week 9

Hello all. Last week has been rough; here's what I could do.

### Report

The printing now works, hence I could run tests. Thanks to that, we could even test both constructors, one from hash_set and the other from Basic.

We need to get the Polynomial wrappers PR in quickly; it is our highest priority.

We need to make the methods more robust, and we plan to get the PR in this weekend.
Once it is in, Shivam can start writing function expansions.

I also have a couple of other tasks:

• Use std::unordered_set so that we can have something even when Piranha is not available as a dependency.
• Replace mpz_class with piranha::integer throughout SymEngine and check out the benchmarks.

I intend to get Polynomial in this weekend, because I am free on weekends :)
As only 3-4 weeks remain, I need to buck up.

That's all I have
Bidāẏa

### Yue Liu(pwntools)

#### GSOC2015 Students coding Week 09

week sync 13

## Last week:

• Single process optimization for load_gadgets() and build_graph()
• Multi Process supporting for GadgetFinder.load_gadgets()
• Multi Process supporting for ROP.build_graph()

An example for libc.so.6, which is larger than 200 KB:

lieanu@ARCH $ time python -c 'from pwn import *; context.clear(arch="amd64"); rop=ROP("/usr/lib/libc.so.6")'
[*] '/usr/lib/libc.so.6'
    Arch:   amd64-64-little
    RELRO:  Partial RELRO
    Stack:  Canary found
    NX:     NX enabled
    PIE:    PIE enabled
python -c  44.18s user 3.04s system 301% cpu 15.655 total

An example for xmms2, which is smaller than 200 KB:

lieanu@ARCH $ ls -alh /bin/xmms2
-rwxr-xr-x 1 root root 133K Jun  4 18:27 /bin/xmms2
lieanu@ARCH $ time python -c 'from pwn import *; context.clear(arch="amd64"); rop=ROP("/bin/xmms2")'
[*] '/bin/xmms2'
    Arch:   amd64-64-little
    RELRO:  Partial RELRO
    Stack:  Canary found
    NX:     NX enabled
    PIE:    No PIE
python -c  86.14s user 1.05s system 305% cpu 28.545 total

• Bottlenecks:
1. All graph operations, such as topological sort and DFS.
2. Classification when finding gadgets.

## Next week:

• Optimization of the graph operations.
• Fixing potential bugs.

## July 23, 2015

### Christof Angermueller(Theano)

#### GSoC: Week six and seven

Theano allows function profiling by setting the profile=True flag. After at least one function call, the compute time of each node can then be printed with debugprint. However, analyzing complex graphs in this way can become cumbersome.

d3printing now allows you to graphically visualize the same timing information and hence to easily spot bottlenecks in Theano graphs! If the function has been profiled, a ‘Toggle profile colors’ button will appear at the top of the page. By clicking on it, nodes will be colored by their compute time. In addition, timing information can be retrieved by a mouse-over event! You can find an example here, and the source code here.

The second new feature is a context menu to edit the label of nodes and to release them from a fixed position.

The next release will make it possible to visualize complicated nested graphs with OpFromGraph nodes. Stay tuned!

The post GSoC: Week six and seven appeared first on Christof Angermueller.

## July 22, 2015

### Palash Ahuja(pgmpy)

#### Map inference in Dynamic Bayesian Network

I am about to be finished with the junction tree algorithm for inference in dynamic Bayesian networks, and am about to start with the MAP queries for inference. Currently, the MAP queries find the maximum value using the maximize operation, but for a dynamic Bayesian network we need to compute the path that has the maximum probability, also called the Viterbi path.
The Viterbi algorithm uses the famous dynamic programming paradigm, although it could be quite complicated to implement for a dynamic Bayesian network. Also, the inference in junction trees will further optimize the scale of operations, so variable elimination will not drag the algorithm down. I hope that the inference works well now.

## July 21, 2015

### Nikolay Mayorov(SciPy)

#### Large-Scale Bundle Adjustment

As a demonstration of the large-scale capabilities of the new least-squares algorithms, we decided to provide an example of solving a real industry problem called bundle adjustment. The most convenient form is an IPython Notebook; later it will serve as an example/tutorial for scipy (perhaps hosted on some server). Here I just give a link to the static version: http://nbviewer.ipython.org/gist/nmayorov/6098a514cc3277d72dd7

### Mridul Seth(NetworkX)

#### GSOC 2015 PYTHON SOFTWARE FOUNDATION NETWORKX BIWEEKLY REPORT 4

Hello folks, this blog post will cover the work done in weeks 7 and 8. Summer is going really fast :)

This period was dedicated to merging the iter_refactor branch into the master branch. The main issues were regarding documentation and improving it: https://github.com/networkx/networkx/issues/1655. We also discussed shifting the tutorial to an IPython notebook https://github.com/networkx/networkx/pull/1663 and moving the current examples to IPython notebooks. I also took a dig at the MixedGraph class https://github.com/networkx/networkx/issues/1168.

Cheers!

### Pratyaksh Sharma(pgmpy)

#### Wait, how do I order a Markov Chain? (Part 2)

Let's get straight to the meat. We were trying to generate samples from $P(\textbf{X} | \textbf{E} = \textbf{e})$. In our saunter, we noticed that using a Markov chain would be a cool idea. But we don't know yet what transition model the right Markov chain must have.
### Gibbs sampling

We are in search of transition probabilities (from one state of the Markov chain to another) that converge to the desired posterior distribution. Gibbs sampling gives us just that.

As per our last discussion on the factored state space, the state is now an instantiation of all variables of the model. We'll represent the state as $(\textbf{x}_{-i}, x_{i})$. Consider the kernel $\mathcal{T}_{i}$ that gives us the transition in the $i^{th}$ variable's state:

$$\mathcal{T}_{i}((\textbf{x}_{-i}, x_{i}) \rightarrow (\textbf{x}_{-i}, x'_{i})) = P(x'_i | \textbf{x}_{-i})$$

Yep, the transition probability does not depend on the current value $x_i$ of $X_i$ -- only on the remaining state $\textbf{x}_{-i}$. You can take my word, or check it for yourself: the stationary distribution that this process converges to is $P(\textbf{X} | \textbf{e})$.

Now all that's left is computing $P(x'_i | \textbf{x}_{-i})$. That can be done in a pretty neat way; I'll show you how next time!

### Jaakko Leppäkanga(MNE-Python)

#### MNE sprint

Last week I was in Paris at the MNE sprint, where many of the contributors came together to produce code. It was nice to see the faces behind the GitHub accounts. It was quite an intensive five days of coding. I finalized the ICA source plotter for raw and epochs objects, and it turned out quite nice. It is now possible to interactively view the topographies of independent components by clicking on the desired component name. I also got some smaller pull requests merged, like adding axes parameters to some of the plotting functions, so that the user can place the figures wherever he/she desires. I also got one day of spare time to see the sights of Paris before flying back to Finland.

This week I'll start implementing similar interactive functionality for TFR plotters.
### Michael Mueller(Astropy)

#### Week 8

This past week I've been adding some functionality to the index system while the current PR is being reviewed: taking slices of slices, including non-copy and non-modify modes as context managers, etc. One issue my mentors and I discussed in our last meeting is the fact that Column.__getitem__ becomes incredibly slow if overridden to check for slices and so forth, so we have to do without it (as part of a larger rebase on astropy/master). Our decision was to drop index propagation upon column slicing and only propagate indices on Table slices; though this behavior is potentially confusing, it will be documented and shouldn't be a big deal. For convenience, a separate method get_item in Column has the same functionality as the previous Column.__getitem__ and can be used instead. I have a lot more to write, but I need to be up early tomorrow morning, so I'll finish this post later.

### AMiT Kumar(Sympy)

#### GSoC : This week in SymPy #8

Hi there! It's been eight weeks into GSoC. Here is the progress for this week.

### Progress of Week 8

This week, my PR for making invert_real more robust was merged, along with these:

• PR #9628 : Make invert_real more robust
• PR #9668 : Support solving for Dummy symbols in linsolve
• PR #9666 : Equate S.Complexes with ComplexPlane(S.Reals*S.Reals)

Note: We renamed S.Complex to S.Complexes, which is analogous with S.Reals, as suggested by @jksuom.

I also opened PR #9671 for simplifying the ComplexPlane output when a ProductSet of FiniteSets is given as input: ComplexPlane(FiniteSet(x)*FiniteSet(y)). It was earlier simplified to:

ComplexPlane(Lambda((x, y), x + I*y), {x} x {y})

It isn't very useful to represent a point or discrete set of points in ComplexPlane with an expression like above.
So in the above PR it is now simplified as a FiniteSet of discrete points in ComplexPlane:

In [3]: ComplexPlane(FiniteSet(a, b, c)*FiniteSet(x, y, z))
Out[3]: {a + I*x, a + I*y, a + I*z, b + I*x, b + I*y, b + I*z, c + I*x, c + I*y, c + I*z}

It's awaiting merge as of now. Now, I have started replacing solve with solveset and linsolve.

### from future import plan

Week #9: This week I plan to merge my pending PRs & work on replacing old solve in the code base with solveset.

### $ git log

PR #9710 : Replace solve with solveset in sympy.stats

PR #9708 : Use solveset instead of solve in sympy.geometry

PR #9671 : Simplify ComplexPlane({x}*{y}) to FiniteSet(x + I*y)

PR #9668 : Support solving for Dummy symbols in linsolve

PR #9666 : Equate S.Complexes with ComplexPlane(S.Reals*S.Reals)

PR #9628 : Make invert_real more robust

PR #9587 : Add Linsolve Docs

PR #9500 : Documenting solveset

That's all for now, looking forward for week #9. :grinning:

## July 20, 2015

### Zubin Mithra(pwntools)

#### Integration tests complete for arm, mips and mipsel + ppc initial commit

This week I worked on getting the integration tests for ARM, MIPS and MIPSel merged in. Additionally, I've set up the qemu image for working with PowerPC (big endian). The image I'm using can be found here, and you will need to install OpenBIOS from here in order to get the qemu image to work. The deb files for the same can be found here.

I used "debian_squeeze_powerpc_standard.qcow2" and "openbios-ppc_1.0+svn1060-1_all.deb". The startup command line is as follows.

qemu-system-ppc -hda ./debian_squeeze_powerpc_standard.qcow2 -m 2047 -bios /usr/share/openbios/openbios-ppc -cpu G4 -M mac99 -net user,hostfwd=tcp::10022-:22 -net nic

Note: Do not use ping to test network connectivity; use "apt-get update" or something similar.
Note 2: To ssh into the image, do "ssh root@localhost -p10022".

Looking at gdb on ppc, the register layout seems to be roughly as shown here. I'll be working on finalising the aarch64 integration test and ppc support this week.

## July 18, 2015

### Chienli Ma(Theano)

#### Putting Hand on OpFromGraph

These two weeks, I started working on OpFromGraph, which is the second part of my proposal.

Currently, if a FunctionGraph has repeated subgraphs, Theano will optimize these sub-graphs individually, which is a waste of both computational resources and time. If we can extract a common structure in a FunctionGraph and make it an Op, we only need to optimize the sub-graph of this Op once and can reuse it everywhere. This will speed up the optimization process, and OpFromGraph provides such a feature.

To make OpFromGraph work well, it should support the GPU and be optimizable. The following features are expected:

• __eq__() and __hash__()
• connection_pattern() and infer_shape()
• Support GPU
• c_code()

I implemented two of these features in the last two weeks: connection_pattern and infer_shape. I hope I can make OpFromGraph a useful feature by the end of this GSoC :).

### Ambar Mehrotra(ERAS Project)

#### GSoC 2015: 5th Biweekly Report

Hi everyone, there were two major features that I worked on during the past two weeks.

Multiple attributes per leaf: I devoted most of my time during the past two weeks to the implementation of this feature. As I have mentioned in earlier blog posts, leaves represent the data sources, i.e., sensor devices directly interfaced with the Tango bus. These include servers like Aouda, Health Monitor, etc.
Each of these servers can have multiple attributes. For example, an Aouda server keeps track of several things like:
• Air Flow
• Temperature
• Heart Rate
• Oxygen Level
In a similar way, there can be multiple servers, each with multiple attributes. This feature involved adding support for all the attributes provided by a server. I achieved this by making each specific attribute of a server a node, instead of the entire server.

Multiple summaries per branch: As mentioned in previous blog posts, a summary represents the minimum/maximum/average value of the raw data coming in from the various children. This feature adds support for multiple summaries per branch. For example:
• A user will be asked to name a summary.
• Select nodes from its children that he wants to keep track of in that summary.
• Select the summary function - Minimum/Maximum/Average.
I started implementing this feature late last week and will continue working on it in the coming week. After that I am planning to move on to the implementation of alarms.
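The summary step above boils down to reducing the selected children's raw values with the chosen function. A minimal sketch (the actual ERAS code is Tango-based and more involved; names here are illustrative):

```python
SUMMARY_FUNCTIONS = {
    'minimum': min,
    'maximum': max,
    'average': lambda values: sum(values) / len(values),
}

def summarize(name, children_values, kind):
    """Apply the selected summary function to the chosen children's values."""
    return {name: SUMMARY_FUNCTIONS[kind](children_values)}

# e.g. a user-named 'cabin' summary over temperature readings
# from three child sensor nodes
assert summarize('cabin', [21.0, 23.0, 25.0], 'average') == {'cabin': 23.0}
assert summarize('cabin', [21.0, 23.0, 25.0], 'minimum') == {'cabin': 21.0}
```

A branch holding several such named summaries would simply keep a list of (name, selected children, kind) triples and recompute each on new data.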

Happy Coding.

## Hey, everybody!

It's been a while since I made a big post--$h!t got crazy what with finding a new apartment to call home, packing all of the books I've accumulated over the years, etc. However... it has been an extremely productive couple of weeks, with a couple of bits of code merged into the main project branch, and a few others sitting in PRs. So let's rap.

## Mass of air? Para-what angle??

As mentioned in a previous post, I've been working on some plotting functions for my Google Summer of Code project, part of the Astroplan project.

### Code says what?

After testing preliminary code in some IPython notebooks, I transferred this code to .py files. An IPython notebook is essentially a frozen interactive programming session--you can write, modify and run code from a web browser and immediately see any results (i.e., plots, numerical output, etc.), but unlike a normal interactive session in IPython, you can save your code AND output. A file that ends in .py is just code--no output--and so it's more efficient to store bits of code for a complex project in this type of file. Other bits of code, either in IPython notebooks or .py files, can call the code you save in a .py file, which may contain descriptions of classes, functions and other objects (see my post here).

So, when you've got very preliminary code, it's nice to have it in IPython notebooks, but once it's (mostly) in working order, you need to transfer it to a .py file and put it in the appropriate place in your copy of the repository you're working on. The usual place to put your Python code is in the source code sub-directory, which tends to have the same name as your project. For instance, our project, Astroplan, has a root directory, astroplan. Inside this directory is another one, also called astroplan, and this is where we put our source code. Most projects will have even more subdirectories, each one containing source code for a particular aspect.
In order for all this code to communicate with each other, each source directory (including the main one) has to have an __init__.py file. When a module such as Astropy is being used, __init__.py files help communicate to your Python installation where to look for useful functions, objects, etc., and make sure there's no confusion about which .py file contains the source code. This is why you can have two modules both containing a function with the exact same name--Python module importing conventions (e.g., "from astropy.coordinates import EarthLocation") plus the __init__.py files make sure everything stays organized.

### The plot increases in viscosity--yet again!*

What all the above meant for me was that I had to figure out how all this worked before I could use my newly-minted plotting functions. When you're working on a development copy of a software package, you don't really want to install it in the usual way (if you even have that option). You'll want to do a temporary, "fake" installation of the code you do have so that you can test it (python setup.py build, anyone?). Sometimes this means you'll have to take the extra step of informing your current Python/IPython session where this installation lies.

### Plots or it didn't happen

The plotting functions for airmass and parallactic angle went through several iterations, and had to wait for some PRs from Brett to get merged in order to use our project's built-in airmass and parallactic angle functions. My PR containing the plot_airmass and plot_parallactic functions finally merged recently--check it out! It also contains some IPython notebook examples on their usage, which will eventually migrate to our documentation page.

Airmass vs. Time for 3 targets as seen from Subaru Telescope on June 30, 2015. Parallactic Angle vs. Time for the same three targets. Polaris would make a horrible target.

You may notice that the sky plot is missing here--due to technical issues, I moved it to a separate PR.
It's unfinished, and hopefully my mentors will have some suggestions *cough, cough* as to how to figure out the funky grid stuff.

### Siddharth Bhat(VisPy)

#### Math Rambling - Dirac Delta derivative

I’ve been studying Quantum Mechanics from Shankar’s Principles of Quantum Mechanics recently, and came across the derivative of the Dirac delta function that had me stumped.

$$\delta'(x - x') = \frac{d}{dx} \delta(x - x') = -\frac{d}{dx'} \delta(x - x')$$

I understood neither what the formula represented, nor how the two sides are equal. Thankfully, some Wikipedia and Reddit (specifically /r/math and /u/danielsmw) helped me find the answer. I’m writing this for myself, and so that someone else might find this useful.

Terminology: I will call $\frac{d}{dx} \delta(x - x')$ the first form, and $-\frac{d}{dx'} \delta(x - x')$ the second form.

Breaking this down into two parts:

1. show what the derivative computes
2. show that both forms are equal

1. Computing the derivative of the Dirac Delta

Since the Dirac Delta function can only be sensibly manipulated in an integral, let’s stick the given form into an integral.

$$\delta'(x - x') = \frac{d}{dx} \delta(x - x') \\ \int_{-\infty}^{\infty} \delta' (x - x') f(x') dx' = \int_{-\infty}^{\infty} \frac{d}{dx} \delta(x - x') f(x') dx'$$

Writing out the derivative explicitly by taking the limit,

$$= \int_{-\infty}^{\infty} \lim_{h \to 0} \; \frac{\delta(x - x' + h) - \delta(x - x')}{h} f(x') dx' \\ = \lim_{h \to 0} \; \frac{ \int_{-\infty}^{\infty} \delta((x + h) - x') f(x') dx' - \int_{-\infty}^{\infty} \delta(x - x') f(x') dx'}{h} \\ = \lim_{h \to 0} \; \frac{f(x + h) - f(x)}{h} \\ = f'(x)$$

Writing only the first and last steps,

$$\int_{-\infty}^{\infty} \delta' (x - x') f(x') dx' = f'(x)$$

This shows us what the derivative of the Dirac delta does: on being multiplied with a function, it “picks” the derivative of the function at one point.

2. Equivalence to the second form

We derived the “meaning” of the derivative.
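As a quick aside, the "picking" property from part 1 can be sanity-checked numerically by replacing the delta with a nascent delta (a narrow Gaussian). This is my own sketch, assuming NumPy; the width, grid, and tolerance are arbitrary:

```python
import numpy as np

# Nascent delta: delta(x) is the eps -> 0 limit of a Gaussian of width eps.
eps = 1e-2
x = 0.5
xp = np.linspace(-5.0, 5.0, 400001)   # integration grid for x'
dx = xp[1] - xp[0]

gauss = np.exp(-(x - xp) ** 2 / (2 * eps ** 2)) / (eps * np.sqrt(2 * np.pi))
ddelta = -(x - xp) / eps ** 2 * gauss  # d/dx of the nascent delta

f = np.sin(xp)
integral = np.sum(ddelta * f) * dx     # should approximate f'(x) = cos(x)
```

Shrinking eps (while refining the grid accordingly) drives the result toward $f'(x)$.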
Now, it’s time to show that the second form is equivalent to the first form. Take the second form of the delta function as the derivative,

$$\delta'(x - x') = - \frac{d}{dx'} \delta(x - x') \\ \int_{-\infty}^{\infty} \delta' (x - x') f(x') dx' = \int_{-\infty}^{\infty} - \frac{d}{dx'} \delta(x - x') f(x') dx'$$

Just like the first time, open up the derivative with the limit definition

$$= \int_{-\infty}^{\infty} \lim_{h \to 0} \; - \left( \frac{\delta(x - (x' + h)) - \delta(x - x')}{h} \right) f(x') dx' \\ = \lim_{h \to 0} \; - \frac{ \int_{-\infty}^{\infty} \delta((x - h) - x') f(x') dx' - \int_{-\infty}^{\infty} \delta(x - x') f(x') dx'}{h} \\ = \lim_{h \to 0} \; - \frac{f(x - h) - f(x)}{h} \\ = \lim_{h \to 0} \; \frac{f(x) - f(x - h)}{h} \\ = f'(x)$$

Conclusion

That shows that the derivative of the delta function has two equivalent forms, both of which simply “pick out” the derivative of the function it’s operating on.

$$\delta'(x - x') = \frac{d}{dx} \delta(x - x') = -\frac{d}{dx'} \delta(x - x')$$

Writing it with a function to operate on (this is the version I prefer):

First form:

$$\int_{-\infty}^{\infty} \delta' (x - x') f(x') dx' = \int_{-\infty}^{\infty} \frac{d}{dx}\delta(x - x') f(x') dx' = f'(x)$$

Second form:

$$\int_{-\infty}^{\infty} \delta' (x - x') f(x') dx' = \int_{-\infty}^{\infty} -\frac{d}{dx'}\delta(x - x') f(x') dx' = f'(x)$$

A note on notation

In a violent disregard for mathematical purity, one can choose to abuse notation and think of the above transformation as

$$\delta'(x - x') = \delta(x - x') \frac{d}{dx}$$

We can write it that way, since one can choose to think that the delta function transforms

$$\int_{-\infty}^{\infty} \delta'(x - x')f(x') dx' \to \int_{-\infty}^{\infty} \delta(x - x')\frac{d}{dx}f(x')dx' = \int_{-\infty}^{\infty} \delta(x - x') f'(x') dx' = f'(x)$$

The original forms and the rewritten one are equivalent, although the original is “purer” than the other.
Which one to use is up to you :)

So, to wrap it up:

$$\delta'(x - x') = \frac{d}{dx} \delta(x - x') = -\frac{d}{dx'} \delta(x - x') = \delta(x - x') \frac{d}{dx}$$

### Vito Gentile(ERAS Project)

#### Enhancement of Kinect integration in V-ERAS: Fourth report

This is my fourth report on what I have done for my GSoC project. If you don’t know what it is about and want to find more information, please refer to this page and this blog post.

During the past two weeks I have worked mainly on two issues: finalizing a first user’s step estimation algorithm, and supporting data analysis during and after the training session for AMADEE’15.

As for the user’s step estimation, I have implemented this algorithm, which uses skeletal data obtained from the Kinect to estimate the user’s rotation and the walked distance every time a new skeletal frame is tracked. Then a Tango change-event on the moves attribute is fired, and any other Tango module can subscribe to this event in order to use this data and implement user navigation. This whole idea will be tested by using a module that Siddhant is writing, which will use the estimated user’s movements to animate a rover on the (virtual) Mars surface.

I have also worked to support a training session for the AMADEE’15 mission, which has taken place in Innsbruck and was organized by the Austrian Mars Society. During this training session, the Italian Mars Society was there to test their V-ERAS system. What I did was, firstly, to configure two Windows 7 machines to be able to execute the new Python-based body tracker. For this purpose we used TeamViewer for remote control of the PCs. After that, we noticed a strange issue which prevented us from using the new body tracker, due to some strange Tango error (we are going to report this to the Tango community). To overcome this annoying and unexpected problem, the old body tracker (written in C# and still available in the ERAS repository) was used.
I have also written some scripts to support Yuval Brodsky and the other IMS team members in evaluating the effects of virtual reality on the neurovestibular system. To do this, I wrote a first script to get the positions of the head and torso skeletal joints from the body tracker, and a second script to convert this data to .xlsx format (to be used by Yuval in data analysis). This allowed me to learn how to use openpyxl, a very easy to use and powerful Python module for writing .xlsx files. To get a feel for it, take a look at this sample code:

```python
from openpyxl import Workbook

wb = Workbook()

# grab the active worksheet
ws = wb.active

# Data can be assigned directly to cells
ws['A1'] = 42

# Rows can also be appended
ws.append([1, 2, 3])

# Python types will automatically be converted
import datetime
ws['A2'] = datetime.datetime.now()

# Save the file
wb.save("sample.xlsx")
```

The scripts I have written for data analysis are not yet in the repository (we are trying to improve them a little bit), and now I have to find a way to include, in the same .xlsx file generated from Kinect data, data taken from the Oculus Rift as well. The next step will then be to include some gesture recognition, in particular the possibility to identify whether the user’s hands are open or closed.

I will keep you updated with the next posts! Ciao!

### Sumith(SymPy)

#### GSoC Progress - Week 8

Hello. Short time since my last post. Here's my report since then.

### Progress

I have continued my work on the Polynomial wrappers. Constructors from hash_set and Basic have been developed and pushed up. Printing has also been pushed. I'm currently writing tests for both; they'll be ready soon.

When hash_set_eq() and hash_set_compare() were developed, we realised that there were many functions of the *_eq() and *_compare() form with repeated logic; the idea was to templatize them, which Shivam did in his PR #533.
A solution to the worry of slow compilation was chalked out, which I wish to try in the coming week: using a std::unique_ptr to a hash_set instead of a straight hash_set, so that the full definition of hash_set need not be known in the header. I've been reading the relevant material; this is known as the PIMPL idiom.

### Report

WIP

* #511 - Polynomial Wrapper

### Targets for Week 9

I wish to develop the Polynomial wrappers further in the following order.

• Constructors and basic methods, add, mul, etc., working with proper tests.
• Solve the problem of slow compilation times.
• As mentioned previously, use standard library alternates to Piranha constructs so that we can have something even when Piranha is not available as a dependency.

After the institute began, the times have been rough. Hoping everything falls in place.

Oh, by the way, SymPy will be present (and represented heavily) at PyCon India 2015. We sent in the content and final proposal for review last week. Have a look at our proposal on the website here.

That's all this week. sayōnara

### Yue Liu(pwntools)

#### GSOC2015 Students coding Week 08

week sync 12

## Last week:

• Update the doctests for ROP module.
• Update the doctests for gadgetfinder module.
• Using LocalContext to get the binary arch and bits.
• Start coding for Aarch64 support.
• Try to do some code optimization.

```
 220462        0.743 0.000 0.760 0.000 :0(isinstance)
 102891        0.413 0.000 0.413 0.000 :0(match)
 116430/115895 0.347 0.000 0.363 0.000 :0(len)
   1119        0.243 0.000 0.487 0.000 :0(filter)
  80874        0.243 0.000 0.243 0.000 glob.py:82(<lambda>)
  11226        0.117 0.000 0.117 0.000 :0(map)
 12488/11920   0.047 0.000 0.050 0.000 :0(hash)
```

• Fix some bugs in rop module.

## Next week:

• Coding for Aarch64.
• Optimizing and fixing potential bugs.
• Add some doctests and pass the example doctests.

## July 15, 2015

### Chau Dang Nguyen(Core Python)

#### Week 7

Hi everyone

In the past few weeks, I had my schedule screwed up, so I didn't have a new post. I have made many improvements to the rest module.
One of them is adding filtering and pagination, so the user can get a filtered list of the data they want. For example, issue?where_status=open&where_priority=normal,critical&page_size=50&page_index=0 will return the first 50 issues which have status "open" and priority "normal" or "critical". The user can also request the server to send pretty output by adding 'pretty=true'. Pretty output will have a 4-space indent.

Another improvement is having a routing decorator, so people can add new functions to the tracker easily, like in Flask:

```python
@Routing.route("/hello", 'GET')
def hello_world(self, input):
    return 200, 'Hello World'
```

Unit tests and basic authentication are also done to provide testing.

### Yask Srivastava(MoinMoin)

#### GSoC Updates

Lastly I worked on UserSettings, but most of that work had to be reverted after discussions. I was pointed to this issue, where it is suggested to merge templates with common features. The alternative was to use Less mixins, e.g. for form styling with Bootstrap. But this resulted in a significant increase in the .css file size. In the end I had to resort to editing the form macros to use Bootstrap components, i.e. exclusive form macros for themes. While this does increase the codebase slightly, there won't be any performance issue in site loading.

I can't use Bootstrap nav-tabs, so instead I styled the tabs to fit the theme. Here is how it looks: http://i.imgur.com/neyADlO.png

Previously I had implemented these tabs with Bootstrap components, but that looked like overkill, since I also had to write separate JavaScript for indicating the * symbol on unsaved forms.

I also worked on the index page to use Bootstrap components (buttons, pagination, ...): http://i.imgur.com/YJQMH1N.png

Content inside the footer is now more consistent thanks to a simple trick I learned from the CSS-Tricks blog.

Roger occasionally forks my repo to test my work. He noticed a bug: an irregular header collapse in mobile view. I fixed the issue in the last commit.
For error validations, which looked ugly (http://i.imgur.com/iDnME65.png), I used HTML5 validations and pattern matching (for emails, passwords, etc.).

There was also a slight bummer last week. I was using some extension in Mercurial which made numerous commits without my consent. Ajitesh suggested deleting the repo and recommitting all changes. That took some time, but I took it as an opportunity to write more verbose commit messages.

### CheckList:

• Fix broken search (Fixed ✓)
• Fix footer icons coming almost at the border (Fixed ✓)
• Fix alignment of the buttons in modernized forms (Fixed ✓)
• Modernized item history still has old tables (I’ll do it today)
• Give a border around text input boxes in modernized (✘)
• Highlight the content in the modernized theme, else it looks too much like basic (✘)

Here is the latest commit I pushed.

### Other Updates

I was invited to give a speech on Software Development: The Open Source Way at IIIT-Delhi. It was a wonderful experience; I love talking to and motivating people to use and contribute to open-source software. And the response was amazing! A couple of people complimented me personally and requested a link to my slides :)

### Teaching Django in college

I love Django and I am currently teaching the first-year students of my college Python and Django. This is truly an amazing experience! Again, the response is pretty good:

“His words are so motivating that I tend to do whatever he tells us. He told us to start blogging, so here I am, writing my first blog.”

“This workshop is mentored by Yask Shirivastava. He is 1 yr elder to us, but seriously is too good, in fact better than the final year students.”

I think I’m doing particularly well in this, but yeah, it needs a lot of time and hard work. You may have the best teacher, but you only learn when you explore it yourself.

Well, that’s what keeps me motivated. I am trying my best not just to teach them concepts of web development but also to ignite passion in them.
I also migrated my blog to Octopress 3.0. Migrating was easy, as all my images are hosted on imgur, using the script I wrote which uploads a screenshot to imgur and copies the URL to the clipboard. Very, very convenient. Check it out:

### AMiT Kumar(Sympy)

#### GSoC : This week in SymPy #7

Hi there! It's been seven weeks into GSoC and the second half has started now. Here is the progress so far.

### Progress of Week 7

This week I opened #9628, which is basically an attempt to make solveset more robust, as I mentioned in my last post. The idea is to tell the user about the domain of the returned solution. Now, it makes sure that n is positive in the following example:

```python
In [3]: x = Symbol('x', real=True)

In [4]: n = Symbol('n', real=True)

In [7]: solveset(Abs(x) - n, x)
Out[7]: Intersection([0, oo), {n}) U Intersection((-oo, 0], {-n})
```

Otherwise it will return an EmptySet():

```python
In [6]: solveset(Abs(x) - n, x).subs(n, -1)
Out[6]: EmptySet()
```

Earlier:

```python
In [12]: solveset(Abs(x) - n, x)
Out[12]: {-n, n}
```

For this to happen, we needed to make changes in invert_real. The Abs handling changed, roughly, from returning Union(g_ys, imageset(Lambda(n, -n), g_ys)) to:

```python
if isinstance(f, Abs):
    return _invert_real(f.args[0],
                        Union(imageset(Lambda(n, n), g_ys).intersect(Interval(0, oo)),
                              imageset(Lambda(n, -n), g_ys).intersect(Interval(-oo, 0))),
                        symbol)
```

So, we applied set operations on the invert to make it return a non-EmptySet only when there is a solution.

### Now For more Complex Cases:

For the following case:

```python
In [14]: invert_real(2**x, 2 - a, x)
Out[14]: (x, {log(-a + 2)/log(2)})
```

For the invert to be real, we must state that a belongs to the interval (-oo, 2], otherwise it would be complex; but no set operation on {log(-a + 2)/log(2)} can make the interval of a be in (-oo, 2].
It does, though, return an EmptySet() on substituting absurd values:

```python
In [23]: solveset(2**x + a - 2, x).subs(a, 3)
Out[23]: EmptySet()
```

So we need not make any changes to the Pow handling in invert_real. It's almost done now, except for a couple of TODOs:

• Document new changes
• Add more tests

Though, I will wait for the final thumbs up from @hargup regarding this.

### from future import plan Week #7:

This week I plan to complete PR #9628, get it merged, and start working on replacing the old solve in the code base with solveset.

### git log

Below is the list of other PR's I worked on:

PR #9671 : Simplify ComplexPlane({x}*{y}) to FiniteSet(x + I*y)

PR #9668 : Support solving for Dummy symbols in linsolve

PR #9666 : Equate S.Complexes with ComplexPlane(S.Reals*S.Reals)

PR #9628 : [WIP] Make invert_real more robust

PR #9587 : Add Linsolve Docs

PR #9500 : Documenting solveset

That's all for now; looking forward to week #8. :grinning:

## July 14, 2015

### Nikolay Mayorov(SciPy)

#### Large-Scale Least Squares

I finally made my code available as a PR to scipy: https://github.com/scipy/scipy/pull/5044 This PR contains all the code, but was branched from the previous one and focuses on sparse Jacobian support. In this post I’ll explain the approach I chose to handle large and sparse Jacobian matrices.

Conventional least-squares algorithms require $O(m n)$ memory and $O(m n^2)$ floating point operations per iteration (again $n$ — the number of variables, $m$ — the number of residuals). So on a regular PC it’s possible to solve problems with $n, m \approx 1000$ in a reasonable time, but increasing these numbers by an order of magnitude or two will cause problems. These limitations are inevitable when working with dense matrices, but if the Jacobian matrix of a problem is significantly sparse (has only a few non-zero elements), then we can store it as a sparse matrix (eliminating memory issues) and avoid matrix factorizations in the algorithms (eliminating time issues). Here I explain how to avoid matrix factorizations and rely only on matrix-vector products.

The crucial part of all non-linear least-squares algorithms is finding (perhaps approximate) solution to linear least squares (it gives $O(m n^2)$ time asymptotics):

$J p \approx -f$.

As a method to solve it I chose the LSMR algorithm, which is available in scipy. I haven’t thoroughly investigated this algorithm, but conceptually it can be thought of as a specially preconditioned conjugate gradient method applied to the least-squares normal equation, with better numerical properties. I preferred it over LSQR because it appeared much more recently and the authors claim that it’s more suitable for least-squares problems (as opposed to systems of equations). The LSMR algorithm requires only matrix-vector multiplication in the forms $J u$ and $J^T v$.
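LSMR itself lives in scipy.sparse.linalg; to illustrate the matrix-free idea without depending on it, here is a toy sketch of mine (not the PR's code) that runs plain conjugate gradients on the normal equations $J^T J p = -J^T f$, touching $J$ only through the two products $J u$ and $J^T v$:

```python
import numpy as np

def lsq_cg(matvec, rmatvec, f, n, iters=50, tol=1e-14):
    """Approximately solve J p ~ -f using only the products J u and J^T v,
    via conjugate gradients on the normal equations J^T J p = -J^T f."""
    b = -rmatvec(f)
    p = np.zeros(n)
    r = b.copy()        # residual of the normal equations
    d = r.copy()        # search direction
    rs = r @ r
    for _ in range(iters):
        Jd = matvec(d)
        alpha = rs / (Jd @ Jd)          # d^T (J^T J) d == ||J d||^2
        p += alpha * d
        r -= alpha * rmatvec(Jd)
        rs_new = r @ r
        if rs_new < tol * tol:
            break
        d = r + (rs_new / rs) * d
        rs = rs_new
    return p

# toy dense check: should match the direct least-squares solution
J = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
f = np.array([1.0, 2.0, 3.0])
p = lsq_cg(J.dot, J.T.dot, f, n=2)
```

Because only the two callables are needed, J could equally be a scipy sparse matrix or a LinearOperator; LSMR plays the same role with better numerical behaviour.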

In the large-scale setting, both implemented algorithms, dogbox and Trust Region Reflective, first compute an approximate Gauss-Newton solution using LSMR. And then:

• dogbox operates in the usual way, i.e. this large-scale modification came almost for free.
• In Trust Region Reflective I apply the 2-d subspace approach to solve a trust-region problem. This subspace is formed by computed LSMR solution and scaled gradient.

When the Jacobian is not provided by a user, we need to estimate it by finite differences. If the number of variables is large, say 100000, this operation becomes very expensive if performed in the standard way. But if the Jacobian contains only a few non-zero elements in each row (its structure should be provided by the user), then columns can be grouped such that all columns in one group are estimated by a single function evaluation, see “Numerical Optimization”, chapter 8.1. The simple greedy grouping algorithm I used is described in this paper. Its average performance should be quite good — the number of function evaluations required is usually only slightly higher than the maximum number of non-zero elements per row. More advanced algorithms treat this as a graph-coloring problem, but they come down to a simple reordering of columns before applying greedy grouping (so they can perhaps be implemented later).
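As a toy illustration of the greedy grouping (my own sketch, not the exact algorithm from the paper): columns whose sparsity patterns share no row can be perturbed in a single finite-difference evaluation, so a tridiagonal Jacobian needs only 3 evaluations no matter how many variables there are:

```python
import numpy as np

def greedy_column_groups(sparsity):
    # sparsity: boolean (m, n) array, True where J[i, j] may be non-zero.
    # Greedily put each column into the first group whose columns share
    # no row with it; each group then costs one function evaluation.
    m, n = sparsity.shape
    groups = []   # list of lists of column indices
    covered = []  # per-group boolean mask of rows already occupied
    for j in range(n):
        col = sparsity[:, j]
        for g, rows in enumerate(covered):
            if not np.any(rows & col):
                groups[g].append(j)
                covered[g] = rows | col
                break
        else:
            groups.append([j])
            covered.append(col.copy())
    return groups

# tridiagonal pattern: 3 groups suffice regardless of n
n = 8
pattern = (np.eye(n, dtype=bool)
           | np.eye(n, k=1, dtype=bool)
           | np.eye(n, k=-1, dtype=bool))
groups = greedy_column_groups(pattern)
```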

In the next post I will report results of algorithms in sparse large problems.

### Highlights:

• I opened #9639 bringing in the rest of the algorithm for computing Formal Power Series. There are still some un-implemented features. I hope to complete them in this week.

• A few of my PRs got merged this week (#9622, #9615 and #9599). Thanks @jcrist, @aktech and @pbrady.

• Opened #9643 for adding the docs related to Fourier Series.

• Polish #9572 and get it ready to be merged.

• Complete #9639.

• Get docs of Fourier Series merged.

That's it. See you all next week. Happy Coding!

### Sudhanshu Mishra(SymPy)

#### GSoC'15: Fourth biweekly update

During this period we've been able to finish and merge following PRs:

As of now I'm working on reducing autosimplifications based on assumptions from the core.

That's all for now.

## July 13, 2015

### Wei Xue(Scikit-learn)

#### GSoC Week 6/7

In weeks 6 and 7, I coded BayesianGaussianMixture for the full covariance type. Now it runs smoothly on synthetic data and the Old Faithful data. Take a peek at the demo.

```python
from sklearn.mixture.bayesianmixture import BayesianGaussianMixture as BGM

bgm = BGM(n_init=1, n_iter=100, n_components=7, verbose=2,
          init_params='random', precision_type='full')
bgm.fit(X)
```


The demo repeats the experiment of PRML, page 480, Figure 10.6. VB on BGMM has shown its capability of inferring the number of components automatically. It converged in 47 iterations.

The ELBO looks a little weird. It is not always going up. When some clusters disappear, the ELBO goes down a little bit, then goes straight up. I think it is because the estimation of the parameters is ill-posed when these clusters have fewer data samples than the number of features.

The BayesianGaussianMixture has many more parameters than GaussianMixture; there are six parameters per component. I feel it is not easy to control so many functions and parameters. The initial design of BaseMixture is also not so good. I took a look at bnpy, which is a more complicated implementation of VB on various mixture models. Though I don't need such a complicated implementation, the decoupling of the observation model, i.e. $X$, $\mu$, $\Lambda$, and the mixture model, i.e. $Z$, $\pi$, is quite nice. So I tried to use Mixin classes to represent these two models. I split MixtureBase into three abstract classes ObsMixin, HiddenMixin and MixtureBase(ObsMixin, HiddenMixin). I also implemented subclasses for Gaussian mixtures, ObsGaussianMixin(ObsMixin), MixtureMixin(HiddenMixin), GaussianMixture(MixtureBase, ObsGaussianMixin, MixtureMixin), but Python did not allow me to do this, since there is no consistent MRO. :-| I changed them back, but this unsuccessful experiment gave me a nice base class, MixtureBase.

I also tried to use cached_property to store intermediate variables such as $\ln \pi$, $\ln \Lambda$, and the Cholesky-decomposed $W^{-1}$, but didn't get much benefit. It is almost the same as saving these variables as private attributes on the instances.
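For reference, the caching pattern looks roughly like this. The sketch uses functools.cached_property, which ships with Python 3.8+ (in 2015 a third-party decorator played the same role), and the Posterior class here is made up purely for illustration:

```python
from functools import cached_property
import math

class Posterior:
    def __init__(self, weights):
        self.weights = list(weights)

    @cached_property
    def log_weights(self):
        # computed on first access, then stored on the instance
        return [math.log(w) for w in self.weights]

p = Posterior([0.25, 0.75])
first = p.log_weights   # computed here
again = p.log_weights   # served from the instance cache
```

Since the value is stored in the instance's __dict__ after the first access, repeated reads cost a plain attribute lookup, which is why it performs about the same as manually stashing a private attribute.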

The numerical issue comes from responsibilities that are extremely small. When estimating resp * log resp, it gives NaN. I simply avoid computing it when resp < 10*EPS. Still, the ELBO seems suspicious.
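The guard can be sketched in plain NumPy (my own illustration, not the scikit-learn code): masking tiny responsibilities applies the limit $r \log r \to 0$ instead of the NaN that $0 \cdot \log 0$ produces:

```python
import numpy as np

EPS = np.finfo(float).eps

def resp_entropy(resp):
    """Sum of resp * log(resp), treating resp -> 0 as contributing 0
    instead of the NaN that 0 * log(0) would produce."""
    resp = np.asarray(resp, dtype=float)
    out = np.zeros_like(resp)
    mask = resp > 10 * EPS        # the same kind of cutoff as in the post
    out[mask] = resp[mask] * np.log(resp[mask])
    return out.sum()
```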

The current implementation of VBGMM in scikit-learn cannot learn the correct parameters on old-faithful data.

```
VBGMM(alpha=0.0001, covariance_type='full', init_params='wmc',
      min_covar=None, n_components=6, n_iter=100, params='wmc',
      random_state=None, thresh=None, tol=0.001, verbose=0)
```


It gives only one component. The weights_ are

```
array([  7.31951611e-07,   7.31951611e-07,   7.31951611e-07,
         7.31951611e-07,   7.31951611e-07,   9.99996340e-01])
```


I also implemented DirichletProcessGaussianMixture, but currently it looks the same as BayesianGaussianMixture. Both of them can infer the best number of components. DirichletProcessGaussianMixture took slightly more iterations than BayesianGaussianMixture. If we infer a Dirichlet process mixture by Gibbs sampling, we don't need to specify the truncation level; only alpha, the concentration parameter, is enough. But with variational inference, we still need to give the model the maximal possible number of components, i.e., the truncation level $T$.

### Isuru Fernando(SymPy)

#### GSoC Week 7

This week I worked on the Sage wrappers and Python wrappers. To make it easier to try out symengine, I made changes to the Sage wrappers such that if Sage does not have the symengine_conversions methods (i.e., Sage not updated to the symengine branch), then conversions are done via Python strings. For example, an integer is converted to a Python string and then to a Sage integer. This is slow, but makes it easier to install symengine. You can try it out by downloading cmake-3.2.3.spkg and symengine-0.1.spkg and installing them. (Link to download is .....) To install, type

```
sage -i /path/to/cmake-3.2.3.spkg
sage -i /path/to/symengine-0.1.spkg
```

The Python wrappers included only a small number of functions from SymEngine. Wrappers were added for functions like log, the trigonometric functions, the hyperbolic functions and their inverses.

CMake package for Sage is now ready for review, http://trac.sagemath.org/ticket/18078.

The SymEngine package for Sage can be found here, https://github.com/isuruf/sage/tree/symengine. A PR will be sent as soon as the CMake ticket is positively reviewed.

Next week, testing with Sage, Python docstrings, and the SymEngine package for Sage are the main things I have planned for now. A PyNumber class to handle Python numbers will be started as well.

## July 12, 2015

### Mark Wronkiewicz(MNE-Python)

#### Opening up a can of moths

C-day + 48

After remedying the coil situation (and numerous other bugs), my filtering method finally seems to maybe possibly work. When comparing my method to the proprietary one, the RMS of the error is on average 1000 times less than the magnetometer and gradiometer RMS.

It turns out that many of the problems imitating the proprietary MaxFilter method stemmed from how the geometry of the MEG sensors was defined in my model. Bear with me here, as you have to understand some background about the physical layout of the sensors to comprehend the problem. When measuring brain activity, each sensor unit takes three measurements: two concerning the gradient of the magnetic field (the gradiometers) and one sensing the absolute magnetic field (a magnetometer). The MEG scanner itself is made up of ~100 of these triplets. The gradiometers and magnetometers are manufactured with different geometries, but they are all similar in that they contain one (or a set) of wire coils (i.e., loops). The signal recorded by these sensors is the result of the magnetic field that threads these coil loops and induces a current within the wire itself, which can then be measured. When modeling this on a computer, however, that measurement has to be discretized, as we can’t exactly calculate how a magnetic field will influence any given sensor coil. Therefore, we break up the area contained in the coil into a number of “integration points.” Now, instead of integrating across the entire rectangular area enclosed by a coil, we calculate the magnetic field at 9 points within the plane. This allows a computer to estimate the signal any given coil would pick up.

For an analogy, imagine you had to measure the air flowing through a window. One practical way might be to buy 5 or 10 flowmetry devices, hang them so they’re evenly distributed over the open area, and model how air was flowing through using those discrete point sensors. Only here, the airflow is a magnetic field and the flow sensors are extremely expensive and sensitive SQUIDs bathed in liquid helium – other than that, very similar.

The hang-up I’ve been dealing with is largely because there are different ways to define those discrete points for the numerical integration. You can have more or fewer points (trading off accuracy vs. computational cost), and there are certain optimizations for how to place them. As far as placement goes, all points could be evenly spaced with equal weighting, but there are big fat engineering books that recommend more optimal (and uneven) weightings of the points depending on the shape in use. It turns out the proprietary SSS software used one of these optimized arrangements, while MNE-Python uses an evenly distributed and weighted arrangement. Fixing the coil definitions has made my custom implementation much closer to the black box I’m trying to replicate.
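To make the "optimized, unevenly weighted points" idea concrete, here is a toy sketch (the field, coil size, and all numbers are made up): a tensor-product 3×3 Gauss-Legendre rule integrates a smooth field over a square coil using just 9 unevenly weighted points, exactly for low-order polynomial fields:

```python
import numpy as np

# Hypothetical smooth normal field over a 2a x 2a square coil (made-up numbers)
def B_z(x, y):
    return 1.0 + 0.5 * x + 0.3 * y + 4.0 * x * y

a = 0.01  # half-width of the coil (assumed)
pts, wts = np.polynomial.legendre.leggauss(3)  # 3-point rule on [-1, 1]
x, w = a * pts, a * wts                        # map points/weights to [-a, a]

# Tensor-product 3x3 rule: 9 integration points with uneven weights
flux = sum(w[i] * w[j] * B_z(x[i], x[j]) for i in range(3) for j in range(3))
# The odd terms integrate to zero over the symmetric coil, so the exact
# flux for this field is 4 * a**2, which the 9-point rule reproduces.
```

Evenly spaced, equally weighted points would need many more samples to reach the same accuracy, which is exactly the difference between the two coil-definition schemes described above.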

In the process I’ve also been forced to learn the dedication it takes to produce high-quality code. Before last week, I felt pretty high and mighty because I was religiously following PEP8 standards and making sure my code had something more than zero documentation. With some light nudging from my mentors, I feel like I’ve made the next solid leap forward; unit tests, markup, extensive references and comments have all been a theme since my last blog post. In the process, it can be frustrating to get that all right, but I’m sure the minor annoyance is a small price to pay to make this esoteric algorithm easier for the poor soul who inherits the SSS workload :)

## July 09, 2015

### Brett Morris(Astropy)

#### astroplan Tutorial 1

I'm long overdue for a post about my Google Summer of Code project with astropy, called astroplan. For background on GSoC and astroplan, see this earlier blog post.

#### Why so silent?

I haven't posted in a while because, well, we've been working on the code! You can see the progress in our GitHub repository, where I've made a few big contributions over the past few weeks in pull requests 11 and 14. Most of the discussion about the development of the core functionality of astroplan is in those pull requests.

#### Quick Tutorial: observation planning basics

Say you're going to observe sometime in the near future, and you need to figure out: the time of sunrise and sunset, the altitude of your target at a particular time from your observatory, and when the target next transits the meridian. Let's use Vega as our target and Mauna Kea as the location of our observatory, and use astroplan to find the answers:

```python
from astropy.coordinates import EarthLocation
from astropy.time import Time
from astroplan import Observer, FixedTarget
import astropy.units as u

# Initialize Observer object at the location of Keck
keck = EarthLocation.from_geodetic('204d31m18s', '19d49m42s', 4160)
obs = Observer(location=keck, timezone='US/Hawaii')

# Initialize FixedTarget object for Vega using from_name
vega = FixedTarget.from_name('Vega')

# Pick the time of our observations in UTC
time = Time('2015-07-09 03:00:00')

# Calculate the time Vega rises above 30 degrees:
next_rise_vega = obs.calc_rise(time, vega, horizon=30*u.deg)
print('Vega rises: {0} [ISO] = {1} [JD]'.format(next_rise_vega.iso,
                                                next_rise_vega.jd))
```
The above code returns:
```
Vega rises: 2015-07-09 05:24:22.732 [ISO] = 2457212.72526 [JD]
```
The time at next rise is an astropy Time object, so it's easy to convert it to other units. Now let's do the rest of the calculations:
```python
# Calculate time of sunrise, sunset
previous_sunset = obs.sunset(time, which='previous')
next_sunrise = obs.sunrise(time, which='next')
print('Previous sunset: {}'.format(previous_sunset.iso))
print('Next sunrise: {}'.format(next_sunrise.iso))

# Is Vega up at the present time?
vega_visible = obs.can_see(time, vega)
print('Is Vega up?: {}'.format(vega_visible))

# When will Vega next transit the meridian?
next_transit = obs.calc_meridian_transit(time, vega, which='next')
print("Vega's next transit: {}".format(next_transit.iso))
```
prints the following:
```
Previous sunset: 2015-07-08 05:02:09.435
Next sunrise: 2015-07-09 15:53:53.525
Is Vega up?: True
Vega's next transit: 2015-07-09 09:51:18.800
```
Now let's say you need a half-night of observations. What are the times of astronomical sunrise/sunset and midnight?
```python
# Sunrise/sunset at astronomical twilight, nearest midnight:
set_astro = obs.evening_astronomical(time, which='previous')
rise_astro = obs.morning_astronomical(time, which='next')
midnight = obs.midnight(time)
print('Astronomical sunset: {}'.format(set_astro.iso))
print('Astronomical sunrise: {}'.format(rise_astro.iso))
print('Midnight: {}'.format(midnight.iso))
```
which prints:
```
Astronomical sunset: 2015-07-08 06:29:05.259
Astronomical sunrise: 2015-07-09 14:27:05.156
Midnight: 2015-07-09 10:27:59.015
```
You can also view this code in an IPython Notebook here.

#### Quick Update - Sunday, 5 July to Thursday, 9 July

Quick update!

This week, I have:

1) Updated the PR with plot_airmass and plot_parallactic, as well as example notebooks.
2) Made another branch for plot_sky.

### Zubin Mithra(pwntools)

#### Aarch64 SROP support completed

I just added Aarch64 SROP support to pwntools. There is no sys_sigreturn on Aarch64; instead there is a sys_rt_sigreturn implementation. In a lot of ways writing the SROP frame was similar to my ARM experience; there is a magic flag value (FPSIMD_MAGIC) that needs to be present.
Quick note: when setting up QEMU images for an architecture, do not test network connectivity using ping; ICMP might be disabled.
There is something else that is curious about Aarch64: regardless of the gcc optimization level, the local variables are allocated first, and only afterwards are the return address and the frame pointer pushed onto the stack. I found this quite interesting; I don't think I've seen it on any other architecture.

For example, at the prologue:
0x000000000040063c <+0>: sub sp, sp, #0x200
0x0000000000400640 <+4>: stp x29, x30, [sp,#-16]!

and at the epilogue:
0x000000000040067c <+64>: ldp x29, x30, [sp],#16
0x0000000000400680 <+68>: add sp, sp, #0x200
0x0000000000400684 <+72>: ret

For a PoC we can get away with something like this, so we end up overwriting the return address of stub (and not of read_input). The make_space call makes sure that the "access_ok" function inside the kernel (which checks whether the frame on the stack can be accessed) does not fail. The frame is about 4704 bytes in size; so when access_ok runs, we need sp to sp+4704 to be mapped in as valid addresses.

The registers for the SROP frame are named regs[31] in the kernel source; so I used forktest.c, set a breakpoint at the handler, compared the stack state before the "svc 0x0" with the register state after it, and found the offsets.

You can view the PR for the same here.

## July 08, 2015

### Rupak Kumar Das(SunPy)

#### Mid-Term Update

Hi all!

The midterm evaluations are over now. Regarding the project, half of the work is done. I fixed a few bugs in the Cuts plugin, which was also modified to include the Slit features. It is nearly complete, with only a few more fixes needed. Eric, who maintains Ginga, has partially implemented the Bezier curve, but it needs a function to determine the points lying on the curve before the Cuts plugin can use it, which is my current focus. Also, I need to figure out how to use OpenCV to save arrays as video in the MultiDim plugin instead of using 'mencoder' as it does now, but it seems OpenCV has a few problems.

Here’s a useful post on how to determine which points lie on a Bezier Curve.
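One common way to determine which points lie on a Bezier curve is to sample it with de Casteljau's algorithm; the sketch below is only an illustration (with made-up function names), not the actual Ginga implementation:

```python
# Hypothetical sketch: find integer pixel coordinates along a cubic
# Bezier curve by sampling it with de Casteljau's algorithm.

def de_casteljau(points, t):
    """Evaluate a Bezier curve with the given control points at t in [0, 1]."""
    pts = list(points)
    while len(pts) > 1:
        # Repeatedly interpolate between neighbouring points.
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]

def bezier_pixels(control_points, samples=200):
    """Return the distinct integer (x, y) pixels lying on the curve."""
    seen, result = set(), []
    for i in range(samples + 1):
        x, y = de_casteljau(control_points, i / samples)
        pixel = (round(x), round(y))
        if pixel not in seen:
            seen.add(pixel)
            result.append(pixel)
    return result

ctrl = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]
print(de_casteljau(ctrl, 0.5))  # -> (5.0, 7.5)
```

A fixed sample count is the simplest approach; a real implementation would adapt the sampling density to the curve's length so no pixel gaps appear.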

## July 07, 2015

### Sartaj Singh(SymPy)

#### GSoC: Update Week-6

Midterm evaluations are complete. I've got to say, Google was fairly quick in mailing the results. Just a few minutes after the deadline, I received a mail telling me I had passed. Yay!

Here's my report for week 6.

### Highlights:

#### 1. Formal Power Series:

For most of the week I worked on improving the implementation of the second part of the algorithm. I was able to increase the range of admissible functions. For this I had to write a custom solver for REs of hypergeometric type. It's a lot faster and better at solving the specific type of REs this algorithm generates than just using rsolve for all the cases. However, it still has some issues. It's currently in the testing phase and will probably be PR-ready by the end of this week.

The code can be found here.

While working on it, I also added some more features to FormalPowerSeries(#9572).

Some working examples. (All the examples were run in isympy)

In [1]: fps(sin(x), x)
Out[1]: x - x**3/6 + x**5/120 + O(x**6)
In [2]: fps(cos(x), x)
Out[2]: 1 - x**2/2 + x**4/24 + O(x**6)
In [3]: fps(exp(acosh(x)))
Out[3]: I + x - I*x**2/2 - I*x**4/8 + O(x**6)


#### 2. rsolve:

During testing, I found that rsolve raises exceptions when trying to solve REs like (k + 1)*g(k) and (k + 1)*g(k) + (k + 3)*g(k+1) + (k + 5)*g(k+2), rather than simply returning None as it generally does in case it is unable to solve a particular RE. The first and second REs are generated by the functions 1/x and (x**2 + x + 1)/x**3 respectively, which can often come up in practice. To solve this I opened #9615. It is still under review.

#### 3. Fourier Series:

#9523 introduced SeriesBase class and FourierSeries. Both FormalPowerSeries and FourierSeries are based on SeriesBase. Thanks @flacjacket and @jcrist for reviewing and merging this.

In [1]: f = Piecewise((0, x <= 0), (1, True))
In [2]: fourier_series(f, (x, -pi, pi))
Out[2]: 2*sin(x)/pi + 2*sin(3*x)/(3*pi) + 1/2 + ...


#### 4. Sequences:

While playing around with sequences, I realized periodic sequences can be made more powerful. They can now be used for periodic formulas(#9613).

In [1]: sequence((k, k**2, k**3))
Out[1]: [0, 1, 8, 3, ...]
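The periodic-formula behaviour shown above can be mimicked in plain Python. This is only an illustrative sketch of the idea, not SymPy's implementation:

```python
# Illustrative sketch (not SymPy's code): a periodic-formula sequence
# cycles through the given formulas, applying formula n % period to
# the index n itself.

def periodic_sequence(formulas):
    """Return a function mapping index n to formulas[n % len(formulas)](n)."""
    def term(n):
        return formulas[n % len(formulas)](n)
    return term

seq = periodic_sequence([lambda k: k, lambda k: k**2, lambda k: k**3])
print([seq(n) for n in range(6)])  # -> [0, 1, 8, 3, 16, 125]
```

Note how the index is fed into a different formula each step, which reproduces the [0, 1, 8, 3, ...] pattern above.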


#### 5. Others:

Well, I got tired of FormalPowerSeries (I am just a human), so I took a little detour from my regular project work and opened #9622 and #9626. The first one deals with inconsistent diff of Polys, while the second adds more assumption handlers like is_positive to Min/Max.

Plans for next week:
• Test and polish hyper2 branch. Complete the algorithm.
• Add sphinx docs for FourierSeries.
• Start thinking on the operations that can be performed on FormalPowerSeries.

That's it. See you all next week. Happy Coding!

### Zubin Mithra(pwntools)

#### Setting up Aarch64 and QEMU

This is a short quick post on how I set up Aarch64 with a NAT connection.
For the most part, the process is similar to what is described here and here. Here is the command line I ended up using to start the VM.

HOST=ubuntu; mac=52:54:00:00:00:00; sshport=22000
sudo qemu-system-aarch64 -machine virt -cpu cortex-a57 -nographic -smp 1 -m 512 \
-global virtio-blk-device.scsi=off -device virtio-scsi-device,id=scsi \
-drive file=ubuntu-core-14.04.1-core-arm64.img,id=coreimg,cache=unsafe,if=none -device scsi-hd,drive=coreimg \
-kernel vmlinuz-3.13.0-55-generic \
-initrd initrd.img-3.13.0-55-generic \
-netdev user,hostfwd=tcp::${sshport}-:22,hostname=$HOST,id=net0 \

### Chienli Ma(Theano)

#### Evaluation Passed and the Next Step: OpFromGraph


The PR for function.copy() is ready to be merged; it only needs Fred to fix a small bug. And this Friday I passed the mid-term evaluation. So it’s time to take the next step.

In the original proposal, the next step was to swap outputs and updates. After a discussion with Fred, we decided this feature wasn't useful, so we skipped it and headed directly to the next feature: OpFromGraph.

## Goal:

Make the OpFromGraph class work.

## Big How?

OpFromGraph should build a gof.Op that is no different from other Ops and can be optimized. Otherwise it makes no sense.

For this, we need to make it work on GPU, make sure it works with C code and document it. Make sure infer_shape(), grad() work with it. Ideally, make R_op() work too.

## Detailed how.

• Implement the __hash__() and __eq__() methods so it behaves like a basic Op.
• Implement the infer_shape() method so that it’s optimizable.
• Test whether it works with shared variables as input and, if not, make it work. Add a test for that.
• Move it correctly to the GPU. We can do this quickly for the old back-end: move all float32 inputs to the GPU. Otherwise, we need to compile the inner function, see which inputs get moved to the GPU, then create a new OpFromGraph with the corresponding inputs on the GPU. #2982
• Make grad() work. This should remove the grad_depth parameter.

### First Step: infer_shape:

The main idea is to calculate the shapes of the outputs from the given input shapes. This is a process similar to executing a function: we cannot know the shape of a variable before knowing the shapes of the variables it depends on. So we can mimic the make_thunk() method to calculate the output shapes from the input shapes. I have come up with a draft, and need some help with test cases.
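The idea of propagating shapes through a graph in dependency order, mimicking execution, can be sketched as follows. The names and graph encoding here are hypothetical illustrations, not Theano's actual infer_shape machinery:

```python
# Hypothetical sketch of shape propagation over a small op graph:
# each node lists its dependencies and a rule mapping dependency
# shapes to its own shape; we walk the graph in topological order,
# just like executing a function.

def infer_shapes(nodes, input_shapes):
    """nodes: list of (name, deps, rule) in topological order.
    rule maps a list of dependency shapes to this node's shape."""
    shapes = dict(input_shapes)
    for name, deps, rule in nodes:
        shapes[name] = rule([shapes[d] for d in deps])
    return shapes

# Example: y = dot(x, w); z = y + b
graph = [
    ("y", ["x", "w"], lambda s: (s[0][0], s[1][1])),  # matrix product
    ("z", ["y", "b"], lambda s: s[0]),                # broadcasted add
]
result = infer_shapes(graph, {"x": (32, 100), "w": (100, 10), "b": (10,)})
print(result["z"])  # -> (32, 10)
```

No actual values are computed; only shape rules are evaluated, which is exactly why infer_shape makes the Op optimizable without running it.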

### Zubin Mithra(pwntools)

#### Tests for AMD64 and aarch64

This week I've been working on adding an integration test into srop.py for AMD64. You can see the merged PR here. Writing an integration test involves writing mako templates for read and sigreturn.
I've also been working on setting up an AARCH64 qemu machine with proper networking settings.

Next week, I'll be working on getting AARCH64 merged in along with its doctest, and the rest of the integration tests.

### Isuru Fernando(SymPy)

#### GSoC Week 6

This week, I worked on improving the testing and making Sage wrappers. First, building with Clang had several issues and these were not tested. One issue was a Clang bug triggered when the -ffast-math optimization is used. This flag makes floating-point arithmetic perform better, but it may perform arithmetic not allowed by the IEEE floating-point standard. Since it is faster, we have enabled it in Release mode, and due to a bug in Clang, a compiler error is given saying error: unknown type name '__extern_always_inline'. This was fixed by first checking in CMake whether the error occurs and then adding the flag -D__extern_always_inline=inline. Another issue was that the type_traits header was not found. This was fixed by upgrading the standard C++ library, libstdc++.

This week, I finished the wrappers for Sage. Now converters to and from Sage can be found in sage.symbolic.symengine. For this module to convert using the C++ level members, symengine_wrapper.pyx's class definitions were taken out and declared in symengine_wrapper.pxd and implemented in the pyx file. To install symengine in Sage, https://github.com/sympy/symengine/issues/474 has to be resolved. A cmake check will be added to find whether this issue exists, and if so, the flag -Wa,-q will be added to the list of flags. We have to make a release of symengine if we are to make spkgs to install symengine in Sage, so some of my time next week will go into getting symengine ready for a release and then making spkgs for everyone to try out symengine.

## July 03, 2015

### Lucas van Dijk(VisPy)

#### GSoC 2015: Midterm summary

Hi all!

It's midterm time! And therefore it is time for a summary. What did I learn these past few weeks, and what were the main road blocks?

## What I learned

This is my first project using OpenGL, and a lot has become clearer about how the system works: the pipeline, the individual shaders and GLSL, and how they're used for drawing 2D and 3D shapes. Of course, I've only scratched the surface so far, but this is a very good basis for more advanced techniques.

I've learned about some mathematical techniques for drawing 2D sprites.

I've also gained a bit more Git experience in a situation where I'm not the only developer of the repository.

This has been a great experience, and the core developers of Vispy are very active and responsive.

## Challenges

I was a bit fooled by my almost lecture-free college schedule in May/June, but the final personal assignments were a bit tougher and bigger than expected. So combining GSoC with all these study assignments was sometimes quite a challenge. But the college year is almost over, and after next week I can focus 100% on GSoC.

In terms of code: I don't think I've encountered really big roadblocks. It maybe took a bit more time before every piece of a lot of shader code fell together, but I think I'm starting to get a good understanding of both the Vispy architecture and OpenGL.

## Past week

The past week I've been trying to flesh out the requirements for the network API a bit, and I've also been investigating the required changes for the arrow-head visual, because there's a scenegraph and visual system overhaul coming: https://github.com/vispy/vispy/pull/928.

Until next time!

### Abraham de Jesus Escalante Avalos(SciPy)

#### Mid-term summary

Hello all,

We're reaching the halfway mark for the GSoC and it's been a great journey so far.

I have had some off-court issues. I was hesitant to write about them because I don't want my blog to turn into me ranting and complaining, but I have decided to briefly mention them on this occasion because they are relevant, and at this point they are all but overcome.

Long story short, I was denied the scholarship that I needed to be able to go to Sheffield so I had to start looking for financing options from scratch. Almost at the same time I was offered a place at the University of Toronto (which was originally my first choice). The reason why this is relevant to the GSoC is because it coincided with the beginning of the program so I was forced to cope with not just the summer of code but also with searching/applying for funding and paperwork for the U of T which combined to make for a lot of work and a tough first month.

I will be honest and say that I got a little worried at around week 3 and week 4 because things didn't seem to be going the way I had foreseen in my proposal to the GSoC. In my previous post I wrote about how I had to make a change to my approach and I knew I had to commit to it so it would eventually pay off.

At this point I am feeling pretty good about the way the project is shaping up. As I mentioned, I had to make some changes, but out of about 40 open issues, only 23 now remain; I have lined up PRs for another 8 and have started discussions (either with the community or with my mentor) on almost all that remain, including some of the larger ones like NaN handling, which spans the entire scipy.stats module and is likely to become a long-term community effort depending on what road NumPy and Pandas take on this matter in the future.

I am happy to look at the things that are still left and find that I at least have a decent idea of what I must do. This was definitely not the case three or four weeks ago and I'm glad with the decision that I made when choosing a community and a project. My mentor is always willing to help me understand unknown concepts and point me in the right direction so that I can learn for myself and the community is engaging and active which helps me keep things going.

My girlfriend, Hélène has also played a major role in helping me keep my motivation when it seems like things amount to more than I can handle.

I realise that this blog (since the first post) has been a lot more about my personal journey than technical details about the project. I do apologise if this is not what you expect but I reckon that this makes it easier to appreciate for a reader who is not familiarised with 'scipy.stats', and if you are familiarised you probably follow the issues or the developer's mailing list (where I post a weekly update) so technical details would be redundant to you.  I also think that the setup of the project, which revolves around solving many issues makes it too difficult to write about specific details without branching into too many tangents for a reader to enjoy.

If you would like to know more about the technical aspect of the project you can look at the PRs, contact me directly (via a comment here or the SciPy community) or even better, download SciPy and play around with it. If you find something wrong with the statistics module, chances are it's my fault, feel free to let me know. If you like it, you can thank guys like Ralf Gommers (my mentor), Evgeni Burovski and Josef Perktold (to name just a few of the most active members in 'scipy.stats') for their hard work and support to the community.

I encourage anyone who is interested enough to go here to see my proposal or go here to see currently open tasks to find out more about the project. I will be happy to fill you in on the details if you reach me personally.

Sincerely,
Abraham.

### Yue Liu(pwntools)

#### GSOC2015 Students coding Week 06

week sync 10

## Last week:

• Issue #37 (setting ESP/RSP) fixed, and a simple implementation of the migrate method added.
• All test cases in issue #38 passed.
• All test cases in issue #39 passed.
• All test cases in issue #36 passed, but more test cases are needed.

## Next week:

• Optimize and fix potential bugs.
• Add some doctests and pass the example doctests.

### Keerthan Jaic(MyHDL)

#### GSoC Midterm Summary

So far, I’ve fixed a release blocking bug, updated the documentation and revamped the core tests. Most of my pull requests have been merged into master. I’ve also worked on refactoring some of the core decorators and improving the conversion tests. However, these are not yet ready to be merged.

In the second period, I will focus on improving the conversion modules. More details can be found in my proposal.

## July 02, 2015

### Manuel Paz Arribas(Astropy)

#### Mid-term summary

Mid-term has arrived and quite some work has been done for Gammapy, especially in the observation, dataset and background modules. At the same time I have learnt a lot about Gammapy, Astropy (especially tables, quantities, angles, times and fits files handling), and python (especially numpy and matplotlib.pyplot). But the most useful thing I'm learning is to produce good code via code reviews. The code review process is sometimes hard and frustrating, but very necessary in order to produce clear code that can be read and used by others.

The last week I have been working on a method to filter observation tables like the one presented in the figure of the first report. The method is intended to select observations according to different criteria (for instance data quality, or lying within a certain region of the sky) that should be used for a particular analysis.

In the case of background modeling this is important to separate observations taken on or close to known sources or far from them. In addition, the observations can be grouped according to similar observation conditions. For instance observations taken under a similar zenith angle. This parameter is very important in gamma-ray observations.

The zenith angle of the telescopes is defined as the angle between the vertical (zenith) and the direction where the telescopes are pointing. The smaller the zenith angle is, the more vertical the telescopes are pointing, and the thinner is the atmosphere layer. This has large consequences in the amount and properties of the gamma-rays detected by the telescopes. Gamma-rays interact in the upper atmosphere and produce Cherenkov light, which is detected by the telescopes. The amount of light produced is directly proportional to the energy of the gamma-ray. In addition, the light is emitted in a narrow cone along the direction of the gamma-ray.

At lower zenith angles the Cherenkov light has to travel a smaller distance through the atmosphere, so there is less absorption. This means that lower energy gamma-rays can be detected.

At higher zenith angles the Cherenkov light of low-energy gamma-rays is totally absorbed, but the Cherenkov light cones of the high-energy ones are longer, and hence the section of ground covered is larger, so particles that fall further away from the telescopes can be detected, increasing the amount of detected high-energy gamma-rays.

The zenith angle is maybe the most important parameter, when grouping the observations in order to produce models of the background.

The method implemented can filter the observations according to this (and other) parameters. An example using a dummy observation table generated with the tool presented on the first report is presented here (please click on the picture for an enlarged view):
Please notice that instead of the mentioned zenith angle, the altitude, its complementary angle (altitude_angle = 90 deg - zenith_angle), is used.
In this case, the first table was generated with random altitude angles between 45 deg and 90 deg (or 0 deg to 45 deg in zenith), while the second table is filtered to keep only zenith angles in the range of 20 deg to 30 deg (or 60 deg to 70 deg in altitude).

The tool can be used to apply selections on any variable present in the observation table. In addition, an 'inverted' flag has been programmed in order to be able to apply the filter so as to keep the values outside the selection range, instead of inside.
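The selection logic described (a per-column range filter with an 'inverted' flag) can be sketched as follows. The function name and table layout are illustrative only, not the actual Gammapy API:

```python
# Illustrative sketch (not the Gammapy method): keep the rows of an
# observation table whose value in `colname` lies inside [vmin, vmax],
# or outside that range when inverted=True.

def filter_observations(table, colname, vmin, vmax, inverted=False):
    selected = []
    for row in table:
        inside = vmin <= row[colname] <= vmax
        if inside != inverted:  # XOR: flip the selection when inverted
            selected.append(row)
    return selected

obs = [{"obs_id": 1, "altitude": 50.0},
       {"obs_id": 2, "altitude": 65.0},
       {"obs_id": 3, "altitude": 85.0}]

# Keep zenith angles of 20-30 deg, i.e. altitudes of 60-70 deg:
kept = filter_observations(obs, "altitude", 60.0, 70.0)
print([r["obs_id"] for r in kept])  # -> [2]
```

Passing inverted=True would instead return observations 1 and 3, i.e. everything outside the selected altitude band.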

Recapitulating the progress done until now, the next steps will be to finish the tools that I am implementing now: the filter observations method described before and the background cube model class on the previous report. In both cases there is still some work to do: an inline application for filtering observations and more methods to create cube background models.

The big milestone is to have a working chain to produce cube background models from existing event lists within a couple of weeks.

### Vito Gentile(ERAS Project)

#### Enhancement of Kinect integration in V-ERAS: Mid-term summary

This is my third report on what I have done for my GSoC project. If you don’t know what it is about and want to find more information, please refer to this page and this blog post.

In this report, I will summarize what I have done until now, and also describe what I will do during the next weeks.

My project is about the enhancement of Kinect integration in V-ERAS, which was entirely based on C# in order to use the official Microsoft API (SDK version: 1.8). However, the ERAS source code is mainly written in Python, so the first step was to port the C# body tracker to Python using PyKinect. This also required rewriting the whole GUI (using PGU).

Then, I also integrated the estimation of the user's height into the body tracker, using skeletal information to calculate it. This has been implemented as a Tango command, so that it can be executed by any device connected to the Tango bus. This feature will be very useful to adjust the avatar size before starting a simulation in V-ERAS.

I have also taken a look at the webplotter module, which will be useful for the incoming AMADEE mission to verify the effect of virtual-reality interaction on the user's movements. What I did was edit the server.py script, which was not able to handle numpy arrays. These structures are used by PyTango for attributes defined as "SPECTRUM"; in order to correctly save the user's data as JSON, I had to add a custom JSON encoder (see this commit for more information).

What I am starting on now is perhaps the most significant part of my project: the implementation of the user's step estimation. At the moment, this feature lives in the V-ERAS Blender repository as a Python Blender script. The idea now is to change the architecture to be event-based: every time a Kinect frame with skeletal data is read by the body tracker, it will calculate the user's movements in terms of body orientation and linear distance, and push a new change event. This event will be read by a new module, being developed by Siddhant (another student participating in GSoC 2015 with IMS and PSF), to move a virtual rover (or any other humanoid avatar) according to the user's movements.

I have started developing the event-based architecture, and what I will do in the coming days is integrate the step estimation algorithm, starting from the one currently implemented in V-ERAS Blender. Then I will improve it, in particular the linear distance estimation; the body orientation is already calculated quite well by the current algorithm, so although I will check its validity, hopefully it can simply be used as it is now.

The last stage of my project will be to implement gesture recognition, in particular the ability to recognize whether the user's hands are closed or not. Recently I had to implement this feature in C# for a project I am developing for my PhD research. With Microsoft Kinect SDK 1.8 it is possible using KinectInteraction, but I am still not sure about the feasibility of this feature with only PyKinect (which is a sort of binding of the C++ Microsoft API). I will find out more about this in the next weeks.

I will let you know every progress with the next updates!

Stay tuned!

### Shridhar Mishra(ERAS Project)

#### Mid - Term Post.

Now that my exams are over, I can work on the project with full efficiency. The current status of my project looks something like this.

Things done:

• Planner in place.
• Basic documentation update of europa internal working.
• Scrapped the pygame simulation of Europa.

Things I am working on right now:

• Integrating Siddhant's battery level indicator from the Husky rover diagnostics with the planner for a more realistic model.
• Fetching and posting things on the PyTango server (yet to bring this to a satisfactory level of working).

Things planned for the future:

• Integrate more devices.
• Improve docs.

### Ambar Mehrotra(ERAS Project)

#### GSoC 2015: Mid-Term and 4th Biweekly Report

Google Summer of Code 2015 started on May 25th and the midterm is already here. I am glad to note that my progress has been in accordance with the timeline I had initially provided. This includes all the work that I had mentioned till the last blog post in this series as well as the work done during the previous week.

During the past week I was busy working on the Data Aggregation and Summary Creation for various branches in the tree. Basic structure and functionality of the tree is as follows:
• The tree can have several nodes inside it.
• Each node can either be a branch (can have more branches or leaves as children) or a leaf (cannot have any children).
• Each node has its raw data and a summary.
• The raw data for a leaf node is the data coming in directly from the device servers, while the raw data for branches is the summary of individual nodes.
• The summary for a leaf node can be defined as the minimum/maximum/average value of the sensor readings over a period of time. Later the user can create a custom function for defining the summary.
• The summary for a branch is the minimum/maximum/average value of its children.
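The aggregation rules above can be sketched in a few lines of Python. The class and method names here are hypothetical, not the actual ERAS code:

```python
# Minimal sketch of the summary rules described above: a leaf
# summarizes its own sensor readings, while a branch summarizes the
# summaries of its children.

def summarize(values, mode="average"):
    if mode == "minimum":
        return min(values)
    if mode == "maximum":
        return max(values)
    return sum(values) / len(values)

class Node:
    def __init__(self, name, children=None, readings=None, mode="average"):
        self.name = name
        self.children = children or []   # non-empty => branch
        self.readings = readings or []   # raw data for a leaf
        self.mode = mode

    def summary(self):
        if self.children:  # branch: aggregate the children's summaries
            return summarize([c.summary() for c in self.children], self.mode)
        return summarize(self.readings, self.mode)  # leaf: raw sensor data

temp = Node("temperature", readings=[20.0, 22.0, 24.0])
pressure = Node("pressure", readings=[1.0, 3.0])
habitat = Node("habitat", children=[temp, pressure], mode="maximum")
print(habitat.summary())  # -> 22.0
```

Because a branch only ever looks at its children's summaries, the same code handles arbitrary depth: the "raw data" of a branch is exactly the list of its children's summaries, as described above.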
Implementation:

To summarize information over time at different levels of the hierarchy, it was necessary to keep logging the data coming in from the device servers. I decided to go with MongoDB, as a JSON-style database seemed like the best option for storing and retrieving data for different levels of hierarchy, and MongoDB is quite popular for such tasks.

I start a thread as soon as the user creates the summary for a data source; it polls the device server at regular intervals and logs the data in the MongoDB database. Similar threads are created for each level of the hierarchy, where each node holds its raw data and summary and knows its immediate children. This kind of structure simplified the process of managing the hierarchy at different levels.

When the user clicks a node, its information (raw data and summary) is shown on the right panel in different tabs. The user also has the option of modifying the summary. Here is a screenshot of the raw data:

In the upcoming weeks and the later part of the program, I am planning to work on various bug fixes, implementation of functionality for multiple attributes from a device server and integration with the Tango Alarm Systems and monitoring alarms.

Happy Coding!

### Jakob de Maeyer(ScrapingHub)

Previously, I introduced the concept of Scrapy add-ons and how it will improve the experience of both users and developers. Users will have a single entry point for enabling and configuring add-ons, without being required to learn about Scrapy’s internal settings structure. Developers will gain better control over enforcing and checking proper configuration of their Scrapy extensions. In addition to their extension, they can provide a Scrapy add-on. An add-on is any Python object that provides the add-on interface. The interface, in turn, consists of a few descriptive variables (name, version, …) and two callbacks: one for enforcing configuration, called before the initialisation of Scrapy’s crawler, and one for post-init checks, called immediately before crawling begins. This post describes the current state of, and issues with, the implementation of add-on management in Scrapy.
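The interface described above might look roughly like this. The class and method names (update_settings, check_configuration) and the settings keys are illustrative assumptions, not Scrapy's final API:

```python
# Rough sketch of the add-on interface: descriptive variables plus two
# callbacks, one run before crawler initialisation (enforce config) and
# one run immediately before crawling begins (post-init checks).

class MySQLPipelineAddon:
    name = "mysql_pipe"
    version = "1.0"

    def update_settings(self, config, settings):
        # Called before crawler initialisation: map the user's add-on
        # config onto Scrapy's internal settings structure.
        settings["MYSQL_DATABASE"] = config.get("database", "localhost")
        settings["MYSQL_USER"] = config.get("user", "root")

    def check_configuration(self, config, settings):
        # Called immediately before crawling: sanity-check the result.
        if not settings.get("MYSQL_DATABASE"):
            raise ValueError("mysql_pipe: no database configured")

addon = MySQLPipelineAddon()
settings = {}
addon.update_settings({"database": "some.server", "user": "some_user"}, settings)
print(settings)  # -> {'MYSQL_DATABASE': 'some.server', 'MYSQL_USER': 'some_user'}
```

The point of the split is that the first callback can still change settings freely, while the second only inspects the fully assembled configuration.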

### Current state

The pull request with the current work-in-progress on the implementation can be found on GitHub. Besides a lot of infrastructure (base classes, interfaces, helper functions, tests), its heart is the AddonManager. The add-on manager ‘holds’ all loaded add-ons and has methods to load configuration files, add add-ons, and check dependency issues. Furthermore, it is the entry point for calling the add-ons’ callbacks. The ‘loading’ and ‘holding’ part can be used independently of one another, but in my eyes there are too many cross-dependencies for the ‘normal’ intended usage to justify separating them into two classes.

### Two “single” entry points?

From a user’s perspective, Scrapy settings are controlled from two configuration files: scrapy.cfg and settings.py. This distinction is not some historical-backwards-compatible leftover, but has a sensible reason: Scrapy uses projects as organisational structure. All spiders, extensions, declarations of what can be scraped, etc. live in a Scrapy project. Every project has settings.py in which crawling-related settings are stored. However, there are other settings that can or should not live in settings.py. This (obviously) includes the path to settings.py (for ease of understanding, I will always write settings.py for the settings module, although it can be any Python module), and settings that are not bound to a particular project. Most prominently, Scrapyd, an application for deploying and running Scrapy spiders, uses scrapy.cfg to store information on deployment targets (i.e. the address and auth info for the server you want to deploy your Scrapy spiders to).

Now, add-ons are bound to a project as much as crawling settings are. Consequentially, add-on configuration should therefore live in settings.py. However, Python is a programming language, and not a standard for configuration files, and its syntax is therefore (for the purpose of configuration) less user-friendly. An ini configuration like this:

# In scrapy.cfg

database = some.server
user = some_user


would (could) look similar to this in Python syntax:

# In settings.py

mysql_pipe = dict(
    _name = 'path.to.mysql_pipe',
    database = 'some.server',
    user = 'some_user',
)


While I much prefer the first version, putting add-on configuration into scrapy.cfg would be very inconsistent with the previous distinction of the two configuration files. It will therefore probably end up in settings.py. The syntax is a little less user-friendly, but after all, most Scrapy users should be familiar with Python. For now, I have decided to write code that reads from both.

In some cases, it might be helpful if add-ons were allowed to load and configure other add-ons. For example, there might be ‘umbrella add-ons’ that decide which subordinate add-ons need to be enabled and configured given some configuration values. Or an add-on might depend on some other add-on being configured in a specific way. The big issue with this is that, with the current implementation, the first time the methods of an add-on are called is during the first round of callbacks to update_settings(). Should an add-on load or reconfigure another add-on there, other add-ons might already have been called. While it is possible to ensure that the update_settings() method of the newly added add-on is called, there is no guarantee (and in fact, it is quite unlikely) that all add-ons see the same add-on configuration in their update_settings().

I see three possible approaches to this:

1. Forbid add-ons from loading or configuring other add-ons. In this case ‘umbrella add-ons’ would not be possible and all cross-configuration dependencies would again be burdened onto the user.
2. Forbid add-ons from doing any kind of settings introspection in update_settings(), instead only allowing them to make changes to the settings object or load other add-ons. In this case, configuring already-enabled add-ons should be avoided, as there is no guarantee that their update_settings() method has not already been called.
3. Add a third callback, update_addons(config, addonmgr), to the add-on interface. Only loading and configuring other add-ons should be done in this method. While it may be allowed, developers should be aware that depending on the config (of their own add-on, i.e. the one whose update_addons() is currently called) is fragile as, once again, there is no guarantee in which order add-ons will be called back.

I have not put too much thought into this just yet, but I think I prefer option 3.
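To make option 3 concrete, here is a rough sketch of what an add-on following such a three-callback interface could look like. The Settings and AddonManager stand-ins, the class, and all configuration keys are hypothetical illustrations, not Scrapy's actual API:

```python
class Settings(dict):
    """Minimal stand-in for the crawler settings object (illustrative)."""
    def set(self, key, value):
        self[key] = value


class AddonManager:
    """Minimal stand-in that records add-ons enabled by other add-ons."""
    def __init__(self):
        self.enabled = []

    def add(self, path, config=None):
        self.enabled.append((path, config or {}))


class MySQLPipeAddon:
    """Hypothetical add-on implementing the callback interface."""

    def update_addons(self, config, addonmgr):
        # Callback for loading/configuring other add-ons only. Relying on
        # `config` here is fragile, since callback order is unspecified.
        if config.get('want_stats'):
            addonmgr.add('path.to.stats_addon', {'interval': 60})

    def update_settings(self, config, settings):
        # Apply this add-on's own configuration; no settings introspection.
        settings.set('MYSQL_DATABASE', config.get('database'))
        settings.set('MYSQL_USER', config.get('user'))


addon = MySQLPipeAddon()
mgr = AddonManager()
settings = Settings()
config = {'database': 'some.server', 'user': 'some_user', 'want_stats': True}
addon.update_addons(config, mgr)
addon.update_settings(config, settings)
```

The point of the separation is that by the time any update_settings() runs, the set of enabled add-ons is already fixed by the earlier update_addons() round.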

### Julio Ernesto Villalon Reina(Dipy)

Midterm Summary

So, the first part of GSoC is over and the first midterm is due today. Here is a summary of this period.

The main goal of the project is to implement a segmentation program that is able to estimate the Partial Volume (PV) between the three main tissue types of the brain (i.e. white matter, cerebrospinal fluid (CSF) and grey matter). The input to the algorithm is a T1-weighted Magnetic Resonance Image (MRI) of the brain.
I checked back on what I have worked on so far and these are my two big accomplishments:

- The Iterated Conditional Modes (ICM) for the Maximum a Posteriori - Markov Random Field (MAP-MRF) Segmentation. This part of the algorithm is at the core of the segmentation as it minimizes the posterior energy of each voxel given its neighborhood, which is equivalent to estimating the MAP.
- The Expectation Maximization (EM) algorithm in order to update the tissue/label parameters (mean and variance of each label). This technique is used because this is an “incomplete” problem, since we know the probability distribution of the tissue intensities but don’t know how each one contributes to it.
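The ICM idea in the first bullet can be sketched in a few lines. This is an illustrative toy version for a 1D image with Gaussian class likelihoods and a Potts neighborhood prior, not the actual Dipy implementation; all names and parameters are hypothetical:

```python
import numpy as np


def icm_sweep(image, labels, mu, var, beta):
    """One ICM pass over a 1D image: assign each voxel the label that
    minimizes its posterior energy, i.e. the Gaussian negative
    log-likelihood plus a Potts smoothness prior over its neighbors."""
    new_labels = labels.copy()
    for i in range(len(image)):
        # 1D neighborhood: the voxels immediately left and right.
        neighbors = [labels[j] for j in (i - 1, i + 1) if 0 <= j < len(image)]
        energies = []
        for k in range(len(mu)):
            # Data term: Gaussian negative log-likelihood for class k.
            e_data = (0.5 * np.log(2 * np.pi * var[k])
                      + (image[i] - mu[k]) ** 2 / (2 * var[k]))
            # Prior term: beta times the number of disagreeing neighbors.
            e_prior = beta * sum(1 for n in neighbors if n != k)
            energies.append(e_data + e_prior)
        new_labels[i] = int(np.argmin(energies))
    return new_labels


# Toy example: two classes with means 0 and 1.
image = np.array([0.0, 0.1, 0.05, 1.0, 0.9, 1.1])
labels = np.array([1, 0, 0, 0, 1, 1])          # noisy initial labeling
labels = icm_sweep(image, labels, mu=[0.0, 1.0], var=[0.1, 0.1], beta=0.5)
```

In a real 3D brain segmentation the sweep runs over a 26-voxel neighborhood and iterates until the labeling stabilizes, with EM updating mu and var between sweeps.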

By combining these two powerful algorithms I was able to obtain 1) the brain segmented into three classes and 2) the PV estimates (PVE) for each tissue type. The images below show an example coronal slice of a normal brain and its corresponding outputs.

What comes next? Tests, tests, tests…. Since I have the segmentation algorithm already up and running I have to do many tests for input parameters such as the number of iterations to update the parameters with the EM algorithm, the beta value, which determines the importance of the neighborhood voxels, and the size of the neighborhood. Validation scripts must be implemented as well to compare the resulting segmentation with publicly available programs. These validation scripts will initially compute measures such as Dice and Jaccard coefficients to verify how close my method’s results are to the others.

For an updated version of the code and all its development since my first pull request please go to:

https://github.com/nipy/dipy/pull/670#partial-pull-merging

Figure captions (images omitted here): the T1 original image; the segmented image (red: white matter, yellow: grey matter, light blue: cerebrospinal fluid); the cerebrospinal fluid PVE; the grey matter PVE; and the white matter PVE.

#### Quick Update - Wednesday, 1 July 2015

Quick update!

Today, I:

1)  Made a PR for plot_airmass and plot_parallactic, as well as some example notebooks for their use.

### Abhijeet Kislay(pgmpy)

#### Mid-Term Summary

The mid-term is over. I am through the first half of Google Summer of Code! As far as accomplishments are concerned, I have almost implemented the culmination of 3 papers in Python and have kept updating my pull request here: pull 420. Most of my efforts went into getting all the algorithms to work so […]

### Aron Barreira Bordin(Kivy)

#### Mid-Term Summary

Hi!

We are at the middle of the program, so let's get an overview of my proposal, what I've done, what I'll be doing in the second part of the project. I'll also post about my experience until now, good and bad aspects, and how I'll work to do a good job.

#### Project Development

I'm really happy to have been able to work on extra features not listed in my proposal. Since I made good progress on my proposal, I also worked on some interesting and important improvements to Kivy Designer. In the second part of the program, I'll finish coding my proposal and try to add as many new features/bug fixes as possible.

#### Blockers

Unfortunately, my university has a different calendar this year and I'll have classes until August 31 ;/, so I'm really sad not to be able to work full time on my project. Sometimes I need to divide my study/work time. As I wrote above, I'm really happy with my progress, but I'd love to be able to do even more.

#### Second period

In this second period, I'll focus my development on releasing a more stable version of Kivy Designer. Right now Kivy Designer is an alpha tool and, honestly, isn't nice to use yet. By the end of the project, my goal is to invert this point of view. To improve the project's stability, I'd like to add unit tests and documentation.

Thx,

Aron Bordin.

## July 01, 2015

### Sartaj Singh(SymPy)

#### GSoC: Update Week-5

Midterm evaluation has started and is scheduled to end by the 3rd of July. So far, GSoC has been a very good experience, and hopefully the next half will be even better.

Yesterday, I had a meeting with my mentor @jcrist. It was decided that we will meet every Tuesday on gitter at 7:30 P.M IST. We discussed my next steps in implementing the algorithm and completing FormalPowerSeries.

#### Highlights:

• Most of my time was spent on writing a rough implementation of the second part of the algorithm. Currently it is able to compute series for some functions but fails for a lot of them. Some early testing indicates this may be due to rsolve not being able to solve some types of recurrence equations.

• FourierSeries and FormalPowerSeries no longer compute the series of a function. Computation is performed inside the fourier_series and fps functions respectively. Both classes are now used only for representing the series.

• I decided it was time to add Sphinx documentation for sequences, so I opened #9590. It will probably be best to add documentation at the same time as the implementation from now on.

• Also opened #9599 that allows Idx instances to be used as limits in both Sum and sequence.

• #9523's review is mostly done and it should get merged soon. Still to do: add the documentation for Fourier series.

• Polish #9572 and make it more robust.

• Improve the range of functions for which series can be computed. Probably will need to improve the algorithm for solving the recurrence equations.
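For illustration, the split between computation and representation described above looks like this in use; a sketch assuming the fps and fourier_series entry points behave as the bullets describe:

```python
from sympy import fps, fourier_series, sin, pi, symbols

x = symbols('x')

# fps() performs the computation and returns a FormalPowerSeries object,
# which only represents the series and can be truncated on demand.
s = fps(sin(x), x)
print(s.truncate(6))

# fourier_series() likewise returns a FourierSeries representation.
f = fourier_series(x, (x, -pi, pi))
```

Keeping the classes as pure representations means truncation, printing, and term access can happen lazily, after the (expensive) computation in the functions.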

This week is going to be fun. Lots to do :)

#### Quick Update - Monday, 29 June and Tuesday, 30 June 2015

Quick update!

The last two days, I:
1) Updated plots.py to reflect the updated core.py functions.
2) Updated example notebooks to include those Astroplan objects/functions.

### Richard Plangger(PyPy)

#### GSOC Mid Term

Now the first half of the GSoC 2015 program is over and it has been a great time. I checked the timeline just recently, and I have almost finished all the work planned for the whole proposal. Here is a list of what I have implemented:
• The tracing intermediate representation has now operations for SIMD instructions (named vec_XYZ) and vector variables
• The proposed algorithm was implemented in the optimization backend of the JIT compiler
• Guard strength reduction that handles arithmetic arguments.
• Routines to build a dependency graph and reschedule the trace
• Extended the backend to emit SSE4.1 SIMD instructions for the new tracing operations
• Ran some benchmark programs and evaluated the current gain
I even extended the algorithm to handle simple reduction patterns, which I did not include in my proposal. This means that numpy.sum or numpy.prod can be executed with SIMD instructions.

Here is a preview of the trace loop speedup the optimization currently achieves.

Note that the setup for all programs is the following: create two vectors (or one for the last three) of 10,000 elements and execute the operation (e.g. multiply) on the specified datatype. It would look similar to:

a = np.zeros(10000, dtype='float')
b = np.ones(10000, dtype='float')
np.multiply(a, b, out=a)

After about 1,000 iterations of multiplying, the tracing JIT records and optimizes the trace. The time is recorded before jumping to and after exiting the trace; the difference is what you see in the plot above. Note that there is still a problem with any/all, and that this is only a micro benchmark. It does not necessarily tell anything about the whole runtime of the program.

For multiply-float64 the theoretical maximum speedup is nearly achieved!

Expectations for the next two months

One thing I'm looking forward to is the Python conference in Bilbao. I have not met my mentors and the other developers yet. This will be awesome!
I have also been promised that we will take a look at the optimization together so that I can improve it further.

To get even better results I will also need to restructure some parts of the Micro-NumPy library in PyPy.
I think I'm quite close to the end of the implementation (because I started in February already), and I expect to spend the rest of the GSoC program extending, testing, polishing, restructuring and benchmarking the optimization.

### Prakhar Joshi(Plone)

#### The Transform Finally!!

Hello everyone, today I will tell you how I implemented the safe_html transform using the lxml library of Python. I ported safe_html from its CMFDefault dependencies to lxml and installed my new transform in place of the old safe_html transform, so whenever our add-on is installed it will uninstall safe_html and install our new transform. There are a lot of questions about lxml and why we use it, so let's explore all of that.

What is lxml?

The lxml XML toolkit is a Pythonic binding for the C libraries libxml2 and libxslt. It is unique in that it combines the speed and XML feature completeness of these libraries with the simplicity of a native Python API, mostly compatible but superior to the well-known ElementTree API.

Why do we need to port the transform to lxml?

Earlier the safe_html transform depended on CMFDefault, and we are all working to make Plone free from CMFDefault dependencies. Because of these dependencies the transform was slow, and the code base for safe_html was old and needed to be changed. Since lxml is fast, we chose it for our transform.

How to implement our transform using lxml?

Till now it's all good: we have decided what to use to remove the CMFDefault dependencies. But the main thing now is how to implement our new transform with lxml so that it functions the same as the previous safe_html transform. For that I had to dig into the lxml libraries and find the modules useful for our transform. I found out that we can use the Cleaner class of the lxml package. This class has several methods like "__init__" and "__call__". So I inherited from the Cleaner class in my HTMLParser class and overrode the "__call__" method according to the requirements of our transform.

I also created a new function named "fragment_fromstring()" which returns the string after removing the nasty tags or elements from it. Here is the snippet for the function:

def fragment_fromstring(html, create_parent=False, parser=None, base_url=None, **kw):
    if not isinstance(html, _strings):
        raise TypeError('string required')
    elements = fragments_fromstring(html, parser=parser,
                                    base_url=base_url, **kw)
    if not elements:
        raise etree.ParserError('No elements found')
    temp2 = []
    if len(elements) > 1:
        for i in range(len(elements)):
            result = elements[i]
            if result.tail and result.tail.strip():
                raise etree.ParserError('Element followed by text: %r' % result.tail)
            result.tail = None
            temp2.append(result)
    else:
        result = elements[0]
        if result.tail and result.tail.strip():
            raise etree.ParserError('Element followed by text: %r' % result.tail)
        result.tail = None
        temp2.append(result)
    return temp2

After that I created the main class for our transform, named SafeHTML, and in that class I defined the transform's initial configuration, i.e. its nasty tags and valid tags.

The transform is set up so that it takes data in as a stream and gives data out as a stream; we created a data object of the IDataStream class.
The convert function then takes the data as input and operates on it as required: if the user supplies nasty tags and valid tags, it filters the input HTML accordingly; otherwise it uses the transform's default configuration.

After writing the transform I tested it with a lot of HTML inputs and checked their outputs as well; they were all as required. The test cases were passing and the safe_html transform script we created was working perfectly. The last thing left was to register our transform and remove the old safe_html transform of PortalTransforms.

Register the new transform and remove the old safe_html transform on add-on installation.

Now that the transform is ready, we have to integrate it with Plone. For that we modify the setuphandlers.py file, since it holds our add-on configuration applied after installation. We have the function "post_install", so we configure our transform and remove the old safe_html transform on post-installation of our add-on.

There are 2 things that have to be done on add-on installation:
1) The old safe_html of PortalTransforms has to be uninstalled/unregistered.
2) The new transform that we created above, named "exp_safe_html", has to be installed.

To uninstall the old transform, we unregister it by name using the TransformEngine of PortalTransforms. We get the tool with "getToolByName(context, 'portal_transforms')", which gives us all the transforms of portal_transforms, and we simply uninstall the transform named safe_html. To confirm this, we log the message "safe_html transform unregistered".

After unregistering the old safe_html, it's time to register our new exp_safe_html transform. For that we use pkgutil to get the module containing our new transform and register it, again via getToolByName(context, 'portal_transforms'). Using the TransformEngine of PortalTransforms we register the new transform for our add-on and log a message on successful registration.

Finally, when I ran the test cases after implementing all this, I saw the logger message "UnRegistering the Safe_html" followed by "Registering exp_safe_html".

Yayaya!! Finally able to register my new transform and unregister the old transform.

I tried to explain the code as much as possible, but since most of this was coding work, it is better to read the code itself; things will be clearer there, as it is quite impossible to detail every minute step of the code here. I hope you understand.

Cheers!!

### Nikolay Mayorov(SciPy)

#### 2D Subspace Trust-Region Method

Trust-region type optimization algorithms solve the following quadratic minimization problem at each iteration:

$\displaystyle \min m(p) = \frac{1}{2} p^T B p + g^T p, \text { s. t. } \lVert p \rVert \leq \Delta$

If such a problem is too big to solve, the following popular approach can be used. Select two vectors and put them in an $n \times 2$ matrix $S$. One of these vectors is usually the gradient $g$; the other is the unconstrained minimizer of the quadratic function (in case $B$ is positive definite) or a direction of negative curvature otherwise. Then it's helpful to make the vectors orthogonal to each other and of unit norm (apply QR to $S$). Now let's require $p$ to lie in the subspace spanned by these two vectors, $p = S q$; substituting into the original problem we get:

$\displaystyle \min m'(q) = \frac{1}{2} q^T B' q + g'^T q, \text { s. t. } \lVert q \rVert \leq \Delta$,

where $B' = S^T B S$ is $2 \times 2$ matrix and $g' = S^T g$.

The problem becomes very small and presumably easy to solve, but we still need to find its accurate solution somehow. An appealing approach, often mentioned in books without details, is to reduce the problem to a fourth-order algebraic equation. Let's find out how to actually do that. As I mentioned in the previous posts, there are two main cases: a) $B'$ is positive definite and $-B'^{-1} g'$ lies within the trust region, in which case it is the optimal solution; b) otherwise, an optimal solution lies on the boundary. Of course the only difficult part is case b. In this case let's rewrite the $2 \times 2$ problem with an obvious change of notation and assuming $\Delta = 1$:

$\displaystyle a x^2 + 2 b x y + c y^2 +2 d x + 2 f y \rightarrow \min_{x, y} \\ \text{ s. t. } x^2 + y^2 = 1$

To solve it we need to find the stationary points of the Lagrangian $L(x, y, \lambda) = a x^2 + 2 b x y + c y^2 + 2 d x + 2 f y + \lambda (x^2 + y^2 - 1)$. Setting the partial derivatives to zero, we arrive at the following system of equations:

$a x + b y + d + \lambda x = 0 \\ b x + c y + f + \lambda y = 0 \\ x^2 + y^2 = 1$

After eliminating $\lambda$ we get:

$b x^2 + (c - a) x y - b y^2 + f x - dy = 0 \\ x^2 + y^2 = 1$

To get rid of the last equation, let's use the parametrization $x = 2 t / (1 + t^2), y = (1 - t^2) / (1 + t^2)$. Then substitute it into the first equation and multiply by the nonzero factor $(1 + t^2)^2$ to get (with the help of sympy):

$(-b + d) t^4 + 2 (a - c + f) t^3 + 6b t^2 + 2 (-a + c + f) t - b - d = 0$

And this is our final fourth-order algebraic equation (note how it is symmetric in some sense). After finding all its roots, we discard the complex ones, compute the corresponding $x$ and $y$, substitute them into the original quadratic function and choose the ones giving the smallest value. Originally I thought that this equation can't have complex roots, but this was not confirmed in practice.
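The substitution can be checked mechanically, in the spirit of the sympy computation mentioned above; a sketch (the coefficients come out in expanded form):

```python
import sympy as sp

a, b, c, d, f, t = sp.symbols('a b c d f t')

# Rational parametrization of the unit circle x^2 + y^2 = 1.
x = 2*t / (1 + t**2)
y = (1 - t**2) / (1 + t**2)

# First equation of the reduced system, multiplied by the
# nonzero factor (1 + t**2)**2 to clear denominators.
expr = sp.cancel((b*x**2 + (c - a)*x*y - b*y**2 + f*x - d*y) * (1 + t**2)**2)

# Collect as a polynomial in t; coefficients from degree 4 down to 0.
coeffs = sp.Poly(expr, t).all_coeffs()
print(coeffs)
```

The printed list matches the quartic's coefficients $(-b + d)$, $2(a - c + f)$, $6b$, $2(-a + c + f)$, $-b - d$.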

Here is the code with my implementation. It contains the solver function and the function checking that the found solution is optimal according to the main optimality theorem for trust-region problems. (See my introductory post on least-squares algorithms.) Root-finding is done by numpy.roots, which I assume to be accurate and robust.

import numpy as np
from numpy.linalg import norm
from scipy.linalg import cho_factor, cho_solve, eigvalsh, orth, LinAlgError


def solve_2d_trust_region(B, g, Delta):
    """Solve a 2-dimensional general trust-region problem.

    Parameters
    ----------
    B : ndarray, shape (2, 2)
        Symmetric matrix, defines a quadratic term of the function.
    g : ndarray, shape (2,)
        Defines a linear term of the function.
    Delta : float
        Radius of the trust region.

    Returns
    -------
    p : ndarray, shape (2,)
        Found solution.
    newton_step : bool
        Whether the returned solution is the Newton step lying within
        the trust region.
    """
    try:
        R, lower = cho_factor(B)
        p = -cho_solve((R, lower), g)
        if np.dot(p, p) <= Delta**2:
            return p, True
    except LinAlgError:
        pass

    a = B[0, 0] * Delta**2
    b = B[0, 1] * Delta**2
    c = B[1, 1] * Delta**2

    d = g[0] * Delta
    f = g[1] * Delta

    coeffs = np.array(
        [-b + d, 2 * (a - c + f), 6 * b, 2 * (-a + c + f), -b - d])
    t = np.roots(coeffs)  # Can handle leading zeros.
    t = np.real(t[np.isreal(t)])

    p = Delta * np.vstack((2 * t / (1 + t**2), (1 - t**2) / (1 + t**2)))
    value = 0.5 * np.sum(p * B.dot(p), axis=0) + np.dot(g, p)
    i = np.argmin(value)
    p = p[:, i]

    return p, False


def check_optimality(B, g, Delta, p, newton_step):
    """Check if a trust-region solution is optimal.

    An optimal solution p satisfies the following conditions for some
    alpha >= 0:

    1. (B + alpha*I) * p = -g.
    2. alpha * (||p|| - Delta) = 0.
    3. B + alpha * I is positive semidefinite.

    Returns
    -------
    alpha : float
        Corresponding alpha value, must be non negative.
    collinearity : float
        Condition 1 check - norm((B + alpha * I) * p + g), must be very small.
    complementarity : float
        Condition 2 check - alpha * (norm(p) - Delta), must be very small.
    pos_def : float
        Condition 3 check - the minimum eigenvalue of B + alpha * I, must be
        non negative.
    """
    if newton_step:
        alpha = 0.0
    else:
        q = B.dot(p) + g
        i = np.argmax(np.abs(p))
        alpha = -q[i] / p[i]

    A = B + alpha * np.identity(2)
    collinearity = norm(np.dot(A, p) + g)
    complementarity = alpha * (Delta - norm(p))
    pos_def = eigvalsh(A)[0]
    return alpha, collinearity, complementarity, pos_def


def matrix_with_spectrum(eigvalues):
    Q = orth(np.random.randn(eigvalues.size, eigvalues.size))
    return np.dot(Q * eigvalues, Q.T)


def test_on_random(n_tests):
    np.random.seed(0)
    print(("{:<20}" * 4).format(
        "alpha", "collinearity", "complementarity", "pos. def."))
    for i in range(n_tests):
        eigvalues = np.random.randn(2)
        B = matrix_with_spectrum(eigvalues)
        g = np.random.randn(2)
        Delta = 3.0 * np.random.rand(1)[0]
        p, newton_step = solve_2d_trust_region(B, g, Delta)
        print(("{:<20.1e}" * 4).format(
            *check_optimality(B, g, Delta, p, newton_step)))


if __name__ == '__main__':
    test_on_random(10)


The output after running the script:

alpha               collinearity        complementarity     pos. def.
0.0e+00             1.1e-16             0.0e+00             4.0e-01
1.1e+00             9.2e-16             0.0e+00             6.0e-01
4.8e+01             5.0e-16             3.3e-16             4.7e+01
8.9e+00             4.4e-16             0.0e+00             1.0e+01
0.0e+00             3.1e-16             0.0e+00             1.2e+00
2.6e+00             1.1e-16             0.0e+00             2.4e+00
9.1e-01             4.4e-15             0.0e+00             1.1e-02
2.9e+00             2.2e-16             -3.2e-16            2.2e+00
1.6e+00             1.2e-16             -1.8e-16            7.0e-01
1.8e+00             8.0e-15             7.8e-16             5.2e-01


These numbers tell us that all the found solutions are optimal (see the docstring of check_optimality). So, provided we have a good root-finding routine, this approach is simple, robust and accurate.

### Goran Cetusic(GNS3)

#### GSOC GNS3 Docker support: The road so far

So midterm evaluations are ending soon and I'd like to write about my progress before that. If you remember, my last update was about how to write a new GNS3 module. Probably the biggest issue you'll run into is implementing links between various nodes. This is because GNS3 is a jungle of different technologies, each with its own networking stack. Implementing Docker links is no different.

Docker is a different kind of virtualization than what GNS3 has been using until now: OS-level virtualization. VMware, for instance, uses full virtualization. You can read more about the difference in one of the million articles on the Internet. An important thing to note is that Docker uses namespaces to manage its network interfaces. More on this here: https://docs.docker.com/articles/networking/#container-networking. It's great, go read it!

GNS3 uses UDP tunnels for connecting its various VM technologies. This means that after creating a network interface on the virtual machine, it allocates a UDP port on that interface. But this is REALLY not that easy to do in Docker, because while a lot of virtualization technologies have UDP tunnels built in, Docker doesn't. Assuming you've read the article above, this is how it will work (I'm still having trouble with it):

1. Create a veth pair
2. Allocate UDP port on one end of veth pair
3. Wait for container to start and then push the other interface into container namespace
4. Connect the interface to ubridge
If you're wondering what ubridge is -> it's a great little piece of technology that allows you to connect UDP tunnels and interfaces. Hardly anyone's heard of it, but GNS3 has been using it for its VMware machines for quite some time: https://github.com/GNS3/ubridge

The biggest problem is that all of this is hidden deep inside the GNS3 code, which makes you constantly ask the question: "Where the hell should I override this??" Also, you have to take into consideration unforeseen problems like the one I mentioned earlier: you have to actually start the container in order to create the namespace and push the veth interface into it.

Another major problem that was solved: Docker containers require a running process, without which they just terminate. I've decided to make an official Docker image to be used for Docker containers: https://github.com/gcetusic/vroot-linux. It's not yet merged as part of GNS3. Basically, it uses a sleep command as a dummy init process and also installs packages like ip, tcpdump, netstat, etc. It's a great piece of code and you can use it independently of GNS3. In the future I expect there'll be a setting, something like "Startup command", so users will be able to use their own Docker images with their own init process.

It's been a bumpy road so far, solving problems I hadn't really thought about when writing the proposal, but Docker support is slowly getting there.

## June 30, 2015

### Sudhanshu Mishra(SymPy)

#### GSoC'15: Mixing both assumption systems, Midterm updates

It's been very long since I've written anything here. I've created a number of pull requests during this period.

There's also this patch which makes changes in the Symbol itself to make this work.

commit de49998cc22c1873799539237d6202134a463956
Author: Sudhanshu Mishra <mrsud94@gmail.com>
Date:   Tue Jun 23 16:35:13 2015 +0530

    Symbol creation adds provided assumptions to global assumptions

diff --git a/sympy/core/symbol.py b/sympy/core/symbol.py
index 3945fa1..45be26d 100644
--- a/sympy/core/symbol.py
+++ b/sympy/core/symbol.py
@@ -96,8 +96,41 @@ def __new__(cls, name, **assumptions):
         False

         """
+        from sympy.assumptions.assume import global_assumptions
+
         cls._sanitize(assumptions, cls)
-        return Symbol.__xnew_cached_(cls, name, **assumptions)
+        sym = Symbol.__xnew_cached_(cls, name, **assumptions)
+
+        items_to_remove = []
+        # Remove previous assumptions on the symbol with same name.
+        # Note: This doesn't check expressions e.g. Q.real(x) and
+        # Q.positive(x + 1) are not contradicting.
+        for assumption in global_assumptions:
+            if isinstance(assumption.arg, cls):
+                if str(assumption.arg) == name:
+                    items_to_remove.append(assumption)
+
+        for item in items_to_remove:
+            global_assumptions.remove(item)
+
+        for key, value in assumptions.items():
+            if not hasattr(Q, key):
+                continue
+            # Special case to handle commutative key as this is true
+            # by default
+            if key == 'commutative':
+                if not assumptions[key]:
+                    global_assumptions.add(~Q.commutative(sym))
+                continue
+
+            if value:
+                global_assumptions.add(getattr(Q, key)(sym))
+            elif value is False:
+                global_assumptions.add(~getattr(Q, key)(sym))
+
+        return sym
+

 def __new_stage2__(cls, name, **assumptions):
     if not isinstance(name, string_types):

Master

In [1]: from sympy import *
In [2]: %time x = Symbol('x', positive=True, real=True, integer=True)
CPU times: user 233 µs, sys: 29 µs, total: 262 µs
Wall time: 231 µs

This branch

In [1]: from sympy import *
In [2]: %time x = Symbol('x', positive=True, real=True, integer=True)
CPU times: user 652 µs, sys: 42 µs, total: 694 µs
Wall time: 657 µs


I did a small benchmark by creating 100 symbols, setting assumptions on them and later asserting them. It turns out that the version with changes in the ask handlers performs better than the other two.

Here's the report of the benchmarking:

#### When Symbol is modified

Line #    Mem usage    Increment   Line Contents
================================================
6     30.2 MiB      0.0 MiB   @profile
7                             def mem_test():
8     30.5 MiB      0.3 MiB       _syms = [Symbol('x_' + str(i), real=True, positive=True) for i in range(1, 101)]
9     34.7 MiB      4.2 MiB       for i in _syms:
10     34.7 MiB      0.0 MiB           assert ask(Q.positive(i)) is True


pyinstrument report

#### When ask handlers are modified

Line #    Mem usage    Increment   Line Contents
================================================
6     30.2 MiB      0.0 MiB   @profile
7                             def mem_test():
8     30.4 MiB      0.2 MiB       _syms = [Symbol('x_' + str(i), real=True, positive=True) for i in range(1, 101)]
9     31.5 MiB      1.1 MiB       for i in _syms:
10     31.5 MiB      0.0 MiB           assert ask(Q.positive(i)) is True


pyinstrument report

#### When satask handlers are modified

Line #    Mem usage    Increment   Line Contents
================================================
6     30.2 MiB      0.0 MiB   @profile
7                             def mem_test():
8     30.4 MiB      0.2 MiB       _syms = [Symbol('x_' + str(i), real=True, positive=True) for i in range(1, 101)]
9     41.1 MiB     10.7 MiB       for i in _syms:
10     41.1 MiB      0.0 MiB           assert ask(Q.positive(i)) is True


pyinstrument report
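For context, what the patch automates can be done manually with the new-style assumptions system; a minimal sketch:

```python
from sympy import Symbol, Q, ask
from sympy.assumptions import global_assumptions

x = Symbol('x')

# Manually record what the patched Symbol.__new__ would add automatically
# for Symbol('x', positive=True): the assumption goes into the global
# store that the new-style ask() consults.
global_assumptions.add(Q.positive(x))
result = ask(Q.positive(x))

global_assumptions.clear()  # avoid leaking state into later queries
```

With the patch applied, the `global_assumptions.add` call above would happen implicitly at symbol creation, which is exactly what the benchmarks measure.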

On the other hand, the documentation PR is almost ready to go.

As of now I'm working on fixing the inconsistencies between the two assumption systems. After that I'll move on to reducing automatic simplification based on assumptions in the core.

That's all for now. Cheers!

### Palash Ahuja(pgmpy)

#### Inference in Dynamic Bayesian Network (continued)

For the past 2 weeks I have spent some time understanding the algorithmic implementation for inference and implementing it. Today I will be talking about the junction tree algorithm for inference in Dynamic Bayesian Networks.

For processing the algorithm, here are the following steps
1) Initialization :- This requires constructing the two initial junction trees J1 and Jt.
1. J1 is the junction tree created from timeslice 1 of the 2-TBN (2-timeslice Bayesian network), and Jt is the junction tree created from timeslice 2 of the 2-TBN. The time counter is initialized to 0. Also, let the interface nodes (denoted by I1 and I2 for timeslices 1 and 2 respectively) be those nodes whose children are in the next timeslice.
2. If the queries are performed on the initial timeslice, then the results can be output by the standard VariableElimination procedure, using the model with timeslice 1 of the Bayesian network as the base for inference.
3. For evidence, if the time of the evidence is 0, it is applied to the initial static Bayesian network; otherwise, it is applied to the second timeslice of the 2-TBN.
4. For creating the junction tree J1, the procedure as follows:-
1. Moralize the initial static bayesian network.
2. Add the edges from the interface nodes so as to make I1 a clique.
3. Rest of the procedure is the same as it was before. The above step is the only difference.
5. For the junction tree Jt, a similar procedure is followed, where there is a clique formed for I2 as well.
2) Inference procedure :- In this procedure, the clique potential from the interface nodes is passed on to the interface clique (similar to the message passing algorithm), and the time counter is incremented accordingly.
So basically the junction tree Jt acts as a sort of engine: the in-clique is where the values are supplied and the out-clique is where the values are obtained, given the evidence.
The variables in the query are marginalized out at each step as always, and the evidence is applied as well.
The best part about this procedure is that it eliminates entanglement: only the out-clique potential is required for inference.
The implementation is still in progress.

### Vivek Jain(pgmpy)

I worked on the ProbModelXML reader and writer module for this project. My project involved solving various bugs present in the module, as well as addressing the various TODOs. Some of the TODOs are:
Decision Criteria :
The tag DecisionCriteria is used in multicriteria decision making, as follows:

<DecisionCriteria>
    <Criterion name = string >
    </Criterion>2..n
</DecisionCriteria>

Potential :
The tag Potential specifies a potential over a set of variables, as follows:

<Potential type="" name="">

<Variables>
<Variable name="string"/>
</Variables>
<Values></Values>
</Potential>
My project involved parsing the above types of XML for the reader module.

For the writer class, my project involved creating a ProbModelXML file from a given instance of a Bayesian Model.
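As a rough illustration of what the reader has to handle, fragments like the ones above can be parsed with the standard library's ElementTree. The sample document, criterion names, and values below are invented, not taken from an actual ProbModelXML file.

```python
import xml.etree.ElementTree as ET

# Hypothetical ProbModelXML fragment combining the two tags shown above
xml = """
<ProbModelXML>
  <DecisionCriteria>
    <Criterion name="cost"/>
    <Criterion name="effectiveness"/>
  </DecisionCriteria>
  <Potential type="Table" name="P1">
    <Variables>
      <Variable name="A"/>
    </Variables>
    <Values>0.2 0.8</Values>
  </Potential>
</ProbModelXML>
"""

root = ET.fromstring(xml)
# collect the criterion names from DecisionCriteria
criteria = [c.get("name") for c in root.find("DecisionCriteria")]
# pull the variable list and the flat value table out of the Potential
potential = root.find("Potential")
variables = [v.get("name") for v in potential.find("Variables")]
values = [float(x) for x in potential.find("Values").text.split()]
print(criteria, variables, values)
# ['cost', 'effectiveness'] ['A'] [0.2, 0.8]
```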

### Julio Ernesto Villalon Reina(Dipy)

Hi all,

I mentioned before that I was at a conference meeting (Organization of Human Brain Mapping, 2015 http://ohbm.loni.usc.edu/) where I had the great chance to meet with my mentors. Now, it's time to update on what was done during those days and during the week after (last week).
As stated in my proposal, the project consists of classifying a brain T1 MRI into “tissue classes” and estimating the partial volume at the boundary between those tissues. Consequently, this is a brain segmentation problem. We decided to use a segmentation method based on Markov Random Field modeling, specifically the Maximum a Posteriori MRF approach (MAP-MRF). The implementation of a MAP-MRF estimation for brain tissue segmentation is based on the Expectation Maximization (EM) algorithm, as described in Zhang et al. 2001 ("Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm," Medical Imaging, IEEE Transactions on, vol.20, no.1, pp.45-57, Jan 2001). The maximization step is performed using the Iterated Conditional Modes (ICM) algorithm. Thus, together with my mentors, we decided to work on the ICM algorithm first. I started working on it during the Hackathon at OHBM and finished it up last week. It is working now and I have already shared it publicly with the rest of the DIPY team. I submitted my first pull request called:

WIP: Tissue classification using MAP-MRF
https://github.com/nipy/dipy/pull/670#partial-pull-merging

There was a lot of feedback from all the team, especially regarding how to make it faster. The plan for this week is to include the EM on top of the ICM and provide the first Partial Volume Estimates. Will do some testing and validation of the method to see how it performs compared to other publicly available methods such as FAST from FSL (http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FAST).
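As a toy illustration of the ICM step mentioned above, here is a 1-D sketch with Gaussian likelihood terms and a Potts smoothness prior. The class means, noise level, and beta weight are made up for demonstration; this is not Dipy's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
means = np.array([0.0, 1.0])           # intensity mean per tissue class
beta = 1.0                             # smoothness (Potts prior) weight

# synthetic 1-D "image": 20 voxels of class 0, then 20 of class 1
image = np.concatenate([rng.normal(0.0, 0.2, 20),
                        rng.normal(1.0, 0.2, 20)])
labels = (image > 0.5).astype(int)     # crude initial segmentation

for _ in range(5):                     # a few ICM sweeps
    for i in range(len(image)):
        energies = []
        for k in range(len(means)):
            data = (image[i] - means[k]) ** 2      # likelihood term
            neigh = 0                              # Potts disagreement count
            if i > 0:
                neigh += labels[i - 1] != k
            if i < len(image) - 1:
                neigh += labels[i + 1] != k
            energies.append(data + beta * neigh)
        labels[i] = int(np.argmin(energies))       # conditional mode
print(labels)
```

The EM layer that is planned next would re-estimate the class means and variances between ICM sweeps instead of keeping them fixed.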

## June 29, 2015

### Wei Xue(Scikit-learn)

#### GSoC Week 5

Week 5 began with a discussion about whether we should deprecate params. I fixed some bugs in the checking functions, the random number generator, and one of the covariance update methods. In the following days, I completed the main functions of GaussianMixture and all test cases, except AIC, BIC and the sampling functions. The tests are somewhat challenging, since the current implementation in the master branch contains very old test cases imported from Weiss's implementation, which never got improved. I simplified the test cases and wrote more tests not covered by the current implementation, such as covariance estimation, ground truth parameter prediction, and other user-friendly warnings and errors.

Next week, I will begin to code BayesianGaussianMixture.
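For readers unfamiliar with what a GaussianMixture fit does, here is a minimal 1-D EM sketch. It is illustrative only, not scikit-learn's actual implementation, and the data and initial parameters are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
# two well-separated 1-D Gaussian clusters
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

mu = np.array([-1.0, 1.0])        # initial means
sigma = np.array([1.0, 1.0])      # initial standard deviations
pi = np.array([0.5, 0.5])         # initial mixing weights

for _ in range(50):
    # E-step: responsibilities of each component for each point
    d = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / sigma
    r = pi * d
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, and variances
    n = r.sum(axis=0)
    pi = n / len(x)
    mu = (r * x[:, None]).sum(axis=0) / n
    sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n)
print(mu)   # close to the true means (-2, 3)
```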

### Mark Wronkiewicz(MNE-Python)

#### Bug Hunt

C-Day plus 34

For over a week now, the name of the game has been bug hunting. I have had a finished first draft since the last blog post, so I’ve been trying to get the output of my custom SSS filter to match the proprietary version on sample data. One issue that took a couple days to track down was a simple but erroneous swap of two angles in the transformation matrix that converts a gradient from spherical to Cartesian coordinates. I can’t say that this is a new class of obstacles – spherical coordinates have thrown wrench after wrench into my code, since different mathematicians regularly define these coordinates in different ways. (Is it just me, or is having seven separately accepted conventions for the spherical coordinate system a bit absurd?) My project crosses a couple domains of mathematics, so wrestling with these different conventions has helped me deeply appreciate the other mathematical concepts that do have a single accepted formulation.
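The convention clash is easy to demonstrate. The two helper functions below (hypothetical names, not MNE-Python code) implement the same spherical-to-Cartesian conversion, but with the meanings of the two angles exchanged, and they disagree on the same input:

```python
import numpy as np

def sph_to_cart_physics(r, theta, phi):
    """'Physics' convention: theta = polar angle, phi = azimuth."""
    return np.array([r * np.sin(theta) * np.cos(phi),
                     r * np.sin(theta) * np.sin(phi),
                     r * np.cos(theta)])

def sph_to_cart_math(r, theta, phi):
    """'Math' convention: the roles of theta and phi are swapped."""
    return sph_to_cart_physics(r, phi, theta)

pt = (1.0, 0.25 * np.pi, 0.5 * np.pi)
print(sph_to_cart_physics(*pt))   # ~[0, 0.707, 0.707]
print(sph_to_cart_math(*pt))      # ~[0.707, 0.707, 0]
```

Both results lie on the unit sphere, but they are different points, which is exactly the kind of silent angle swap that caused the bug described above.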

Regardless, weeding out the spherical coordinate issue and a menagerie of other bugs has left me with a filter that produces filtered data that is similar to (but not exactly matching) the proprietary code (see some example output below). Luckily, I do have several checkpoints in the filter’s processing chain and I know the problem is between the last checkpoint and the final output. My mentors have been fantastic so far, and we have a potential bead on the last issue; the weak magnetic signals produced by the brain are measured with two major classes of MEG pickup coils: magnetometers and gradiometers. In a very simple sense, one measures the magnetic field while the other measures the spatial derivative of the magnetic field, and (because of this difference) they provide readings on very different scales that I have yet to normalize. Given some luck, this last patch could fix the issue and yield a working solution to the first half of my GSoC project! (Knock on wood.)

 Exemplar data showing the raw unfiltered MEG signal and the same data after the benchmark SSS filter and my own custom filtering (top). Difference between the benchmark and my custom implementation (bottom). The filter in progress is close to, but not quite the same as, the benchmark, implying there remain some bugs to fix.

## June 28, 2015

### Stefan Richthofer(Jython)

#### Midterm evaluation

The midterm-evaluation milestone is as follows:
Have JyNI detect and break reference-cycles in native objects backed by Java-GC. This must be done by Java-GC in order to deal with interfering non-native PyObjects. Further this functionality must be monitorable, so that it can transparently be observed and confirmed.

### Sketch of some issues

The issues to overcome for this milestone were manifold:
• The ordinary reference counting for scenarios that should work without GC contained a lot of bugs in JyNI's C code, which had to be fixed. When I wrote this code initially, the GC concept was still an early draft and in many scenarios it was unclear whether and how reference counting should be applied. All of this needed fixing (and there are probably still remaining issues of this type).
• JNI defines a clear policy for dealing with the jobject pointers it provides. Some of them must be freed explicitly. On the other hand, some might be freed implicitly by the JVM - without your intention, if you don't get it right. On this front too, a vast clean-up of JyNI code was needed, also to avoid immortal trash.
• JyNI used to keep Java-side PyObjects that were needed by native objects alive indefinitely.
Now these must be kept alive by the Java-copy of the native reference-graph instead. It was hard to make this mechanism sufficiently robust. Several bugs caused reference-loss and had to be found to make the entire construct work. On the other hand some bugs also caused hard references to persist, which kept Java-GC from collecting the right objects and triggering JyNI's GC-mechanism.
• Issues with converting self-containing PyObjects between native side and Java-side had to be solved. These were actually bugs unrelated to GC, but still had to be solved to achieve the milestone.
• A mechanism to monitor native references from Java-side, especially their malloc/free actions had to be established.
Macros to report these actions to Java/JyNI were inserted into JyNI's native code directly before the actual calls to malloc or free. What made this tricky is the fact that some objects are not freed by native code (which was largely inherited from CPython 2.7), but cached for future use (e.g. one-letter strings, small numbers, short tuples, short lists). Acquiring/returning an object from/to such a cache is now also reported as malloc/free, but specially flagged. For all these actions JyNI records timestamps and maintains a native object log where one can transparently see the lifetime cycle of each native object.
• The original plan to explore a native object's connectivity in the GC_Track method is not feasible, because for tuples and lists this method is usually called before the object is populated.
JyNI will have a mechanism to make it robust against invalid exploration attempts, but this mechanism should not be used for normal basic operation (e.g. tuple allocation happens for every method call), only for edge cases, e.g. if an extension defines its own types, registers instances of them with JyNI-GC and then does odd stuff with them.
So for now GC_track saves objects in a todo list for exploration, and the actual exploration is performed at certain critical JyNI operations, such as object sync-on-init or just before releasing the GIL. It is likely that this strategy will have to be fine-tuned later.

### Proof of the milestone

To prove that the milestone described above has been achieved, I wrote a script that creates a reference cycle between a tuple and a list, such that naive reference counting would not be sufficient to break it. CPython would have to use its garbage collector to free the corresponding references.
1. I pass the self-containing tuple/list to a native method-call to let JyNI create native counterparts of the objects.
2. I demonstrate that JyNI's reference monitor can display the corresponding native objects ("leaks" in some sense).
3. The script runs Java-GC and confirms that it collects the Jython-side objects (using a weak reference).
4. JyNI's GC-mechanism reports native references to clear. It found them, because the corresponding JyNI GC-heads were collected by Java-GC.
5. Using JyNI's reference monitor again, I confirm that all native objects were freed. Also those in the cycle.

### The GC demonstration-script

import time
from JyNI import JyNI
from JyNI import JyReferenceMonitor as monitor
from JyNI.gc import JyWeakReferenceGC
from java.lang import System
from java.lang.ref import WeakReference
import DemoExtension

#Note:
# For now we attempt to verify JyNI's GC-functionality independently from
# Jython concepts like Jython weak references or Jython GC-module.
# So we use java.lang.ref.WeakReference and java.lang.System.gc
# to monitor and control Java-GC.

JyNI.JyRefMonitor_setMemDebugFlags(1)
JyWeakReferenceGC.monitorNativeCollection = True

l = (123, [0, "test"])
l[1][0] = l
#We create weak reference to l to monitor collection by Java-GC:
wkl = WeakReference(l)
print "weak(l): "+str(wkl.get())

# We pass down l to some native method. We don't care for the method itself,
# but conversion to native side causes creation of native PyObjects that
# correspond to l and its elements. We will then track the life-cycle of these.
print "make l native..."
DemoExtension.argCountToString(l)

print "Delete l... (but GC not yet ran)"
del l
print "weak(l) after del: "+str(wkl.get())
print ""
# monitor.list-methods display the following format:
# [native pointer]{'' | '_GC_J' | '_J'} ([type]) #[native ref-count]: [repr] *[creation time]
# _GC_J means that JyNI tracks the object
# _J means that a JyNI-GC-head exists, but the object is not actually treated by GC
# This can serve monitoring purposes or soft-keep-alive (c.f. java.lang.ref.SoftReference)
# for caching.
print "Leaks before GC:"
monitor.listLeaks()
print ""

# By inserting this line you can confirm that native
# leaks would persist if JyNI-GC is not working:
#JyWeakReferenceGC.nativecollectionEnabled = False

print "calling Java-GC..."
System.gc()
time.sleep(2)
print "weak(l) after GC: "+str(wkl.get())
print ""
monitor.listWouldDeleteNative()
print ""
print "leaks after GC:"
monitor.listLeaks()

print ""
print "===="
print "exit"
print "===="

It is contained in JyNI in the file JyNI-Demo/src/JyNIRefMonitor.py

### Instructions to reproduce this evaluation

1. You can get the JyNI-sources by calling
git clone https://github.com/Stewori/JyNI
Switch to JyNI-folder:
cd JyNI
2. (On Linux with gcc) edit the makefile (on OS X with llvm/clang, makefile.osx) so that it contains the right paths for JAVA_HOME etc. You can place a symlink to jython.jar (2.7.0 or newer!) in the JyNI folder or adjust the Jython path in the makefile.
3. Run make (Linux with gcc)
(for OSX with clang use make -f makefile.osx)
4. To build the DemoExtension enter its folder:
cd DemoExtension
and run setup.py:
python setup.py build
cd ..
5. Confirm that JyNI works:
./JyNI_unittest.sh
6. ./JyNI_GCDemo.sh

### Discussion of the output

Running JyNI_GCDemo.sh:

JyNI: memDebug enabled!
weak(l): (123, [(123, [...]), 'test'])
make l native...
Delete l... (but GC not yet ran)
weak(l) after del: (123, [(123, [...]), 'test'])

Leaks before GC:
Current native leaks:
139971370108712_GC_J (list) #2: "[(123, [...]), 'test']" *28
139971370123336_J (str) #2: "test" *28
139971370119272_GC_J (tuple) #1: "((123, [(123, [...]), 'test']),)" *28
139971370108616_GC_J (tuple) #3: "(123, [(123, [...]), 'test'])" *28

calling Java-GC...
weak(l) after GC: None

Native delete-attempts:
139971370108712_GC_J (list) #0: -jfreed- *28
139971370123336_J (str) #0: -jfreed- *28
139971370119272_GC_J (tuple) #0: -jfreed- *28
139971370108616_GC_J (tuple) #0: -jfreed- *28

leaks after GC:
no leaks recorded

====
exit
====
Let's briefly discuss this output. We created a self-containing tuple called l. Since tuples are immutable, the cycle must go through a list placed in between. Using a Java WeakReference, we confirm that Java-GC collects our tuple. Before that, we let JyNI's reference monitor print a list of native objects that are currently allocated. We refer to them as "leaks", because all native calls are over and there is no obvious need for natively allocated objects at this point. #x denotes the current native ref-count. The listing is explained as follows (observe that it contains a cycle):
139971370108712_GC_J (list) #2: "[(123, [...]), 'test']"

This is l[1]. One reference is from JyNI to keep it alive, the second one is from l.

139971370123336_J (str) #2: "test"

This is l[1][1]. One reference is from JyNI to keep it alive, the second one is from l[1].

139971370119272_GC_J (tuple) #1: "((123, [(123, [...]), 'test']),)"
This is the argument-tuple that was used to pass l to the native method. The reference is from JyNI to keep it alive.
139971370108616_GC_J (tuple) #3: "(123, [(123, [...]), 'test'])"
This is l. One reference is from JyNI to keep it alive, the second one is from the argument-tuple (139971370119272) and the third one is from l[1]. Thus it forms a reference cycle with l[1].

After running Java-GC (and giving it some time to finish), we confirm that our weak reference to l was cleared. And indeed, JyNI's GC mechanism reported some references to clear, with all reported leaks among them. Finally, another call to JyNI's reference monitor no longer lists any leaks.

### Check that this behavior is not self-evident

In JyNI-Demo/src/JyNIRefMonitor.py go to the section:

# By inserting this line you can confirm that native
# leaks would persist if JyNI-GC is not working:
#JyWeakReferenceGC.nativecollectionEnabled = False

Change it to
# By inserting this line you can confirm that native
# leaks would persist if JyNI-GC is not working:

JyWeakReferenceGC.nativecollectionEnabled = False

Run JyNI_GCDemo.sh again. You will notice that the native leaks persist.

### Next steps

The mechanism currently does not cover all native types. While many should already work, I expect that some bugfixing and clean-up will be required to make this actually work. With the demonstrated reference-monitor-mechanism