The Kepler Space Observatory is a NASA-built space telescope that was launched in 2009. It is dedicated to searching for exoplanets in star systems other than our own, with the ultimate goal of finding potentially habitable planets.

I'm plotting stellar metallicity against planetary radius for planets located in the Goldilocks zone, the region around a star where conditions may be just right to support life. I've labeled each point with the corresponding Kepler object name and colored the points by planet type, with the color palette mapped to the 'Planet-Type' variable. I've also noted on the plot that the groupings are the clusters predicted by the K-Means algorithm. This can give insight into how the metallicity of a star may influence the characteristics of potentially Earth-like planets in its orbit.
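As a rough illustration of the clustering step, here is a minimal sketch using scikit-learn's KMeans on two features. The data is synthetic, and the number of clusters, the feature scaling, and the variable names are my assumptions for the example, not the settings used for the plot above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT Kepler measurements): two well-separated groups
# of (stellar metallicity, planetary radius) pairs.
rng = np.random.default_rng(0)
metallicity = np.concatenate([rng.normal(-0.3, 0.05, 20), rng.normal(0.2, 0.05, 20)])
radius = np.concatenate([rng.normal(1.2, 0.1, 20), rng.normal(2.5, 0.1, 20)])
features = np.column_stack([metallicity, radius])

# Standardize so both features contribute comparably to the distance metric.
scaled = StandardScaler().fit_transform(features)

# n_clusters=2 is an assumption chosen to match the synthetic data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(scaled)
```

The resulting `labels` array holds one cluster index per point and could be passed as the color variable of a scatter plot to compare the predicted groups with the planet types.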

I've implemented a Python function to clean a dataset. The function, clean_dataset, takes a pandas DataFrame, replaces infinite values with NaN, and then drops every row that contains a NaN, so only rows in which all values are finite are kept. This ensures the data can be analyzed or modeled without errors caused by infinite or NaN values.
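A function matching that description could look roughly like the sketch below. The body is my reconstruction from the description, and the small example frame is made up purely to show the behavior.

```python
import numpy as np
import pandas as pd


def clean_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df that keeps only rows whose values are all finite."""
    # Treat +inf and -inf as missing so that dropna() removes them as well.
    cleaned = df.replace([np.inf, -np.inf], np.nan)
    # Drop every row that contains at least one NaN.
    return cleaned.dropna(axis=0, how="any")


# Made-up example frame (not Kepler data): rows 1 and 2 contain inf and NaN.
example = pd.DataFrame(
    {"a": [1.0, np.inf, 3.0, 4.0], "b": [0.5, 0.2, np.nan, -1.0]}
)
result = clean_dataset(example)
```

Note that the original index labels are preserved, so a call to `reset_index(drop=True)` may be useful if later steps expect consecutive row numbers.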

This is a scatter plot of stellar surface gravity versus stellar radius. The colors represent different stellar classifications (O, B, A, F, G, K, M), and the size of each dot corresponds to the star's radius. Surface gravity decreases with increasing radius, as expected from g = GM/R^2: at a given mass, a larger star has weaker gravity at its surface. The stars with the lowest surface gravity and the largest radius are marked with the reddest dots, suggesting they could be red supergiants.
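To make the inverse relationship concrete, the sketch below computes log10 of the surface gravity in cgs units from a star's mass and radius. The function name and the rounded physical constants are my own choices, and I am assuming the plotted quantity is the usual log g in cm/s^2.

```python
import math

G_CGS = 6.674e-8     # gravitational constant in cm^3 g^-1 s^-2
M_SUN_G = 1.989e33   # solar mass in grams (rounded)
R_SUN_CM = 6.957e10  # solar radius in centimeters (rounded)


def log_surface_gravity(mass_msun: float, radius_rsun: float) -> float:
    """Return log10(g) with g = G*M/R^2 in cm/s^2, for mass and radius in solar units."""
    g = G_CGS * mass_msun * M_SUN_G / (radius_rsun * R_SUN_CM) ** 2
    return math.log10(g)


sun_like = log_surface_gravity(1.0, 1.0)       # roughly 4.44 for the Sun
giant_like = log_surface_gravity(1.0, 100.0)   # same mass, 100 times the radius
```

Because g scales as 1/R^2, increasing the radius by a factor of 100 at fixed mass lowers log g by exactly 4, which is the kind of drop that separates giants from dwarfs in such a plot.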

This is a scatter plot that displays the relationship between stellar mass and radius for different categories of Kepler Objects of Interest (KOIs): CONFIRMED, CANDIDATE, and FALSE POSITIVE. The plot suggests a positive correlation between stellar mass and radius, particularly for confirmed cases. The color of each point represents its category, while the size scales with the stellar effective temperature (the 'koi_steff' column), so larger dots represent hotter stars. I’ve used the seaborn library for plotting and labeled the axes with appropriate units.
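A plot of this kind could be produced along the lines of the sketch below. Apart from 'koi_steff', the column names ('koi_disposition', 'koi_smass', 'koi_srad') are my assumptions about the dataset, and the four rows are invented values used only to make the example run.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Invented rows for illustration only (not real KOI entries).
df = pd.DataFrame(
    {
        "koi_disposition": ["CONFIRMED", "CANDIDATE", "FALSE POSITIVE", "CONFIRMED"],
        "koi_smass": [0.90, 1.10, 1.40, 0.70],   # stellar mass, solar masses
        "koi_srad": [0.85, 1.20, 1.60, 0.68],    # stellar radius, solar radii
        "koi_steff": [5400, 6000, 6600, 4600],   # effective temperature, K
    }
)

fig, ax = plt.subplots(figsize=(7, 5))
sns.scatterplot(
    data=df,
    x="koi_smass",
    y="koi_srad",
    hue="koi_disposition",  # color encodes the disposition category
    size="koi_steff",       # marker size encodes effective temperature
    sizes=(30, 200),
    ax=ax,
)
ax.set_xlabel("Stellar mass (solar masses)")
ax.set_ylabel("Stellar radius (solar radii)")
```

Keeping color for the category and size for the temperature gives each visual channel a single meaning, which makes the legend easier to read.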