The transition of power grids from traditional grids to Smart Grids (SGs) requires effective Demand Side Management (DSM) and reliable incorporation of Renewable Energy Sources (RESs) in order to maintain the balance between demand and supply and to optimize energy use in an environmentally friendly manner. Data analytics provides solutions to the emerging challenges of power systems, such as DSM, environmental pollution (due to carbon emissions), mitigation of fossil fuel dependency, incorporation of RESs, cost curtailment, and grid stability and security. To manage electricity efficiently and maximize the profit of power utilities, this thesis focuses on several tasks: prediction of electricity load to avoid a mismatch between demand and generation, wind power forecasting to satisfy energy demand effectively, electricity price forecasting for regulating market operations, carbon emissions forecasting for reducing carbon tax payments, and Electricity Theft Detection (ETD) for recovering the revenue that power utilities lose to electricity theft. In addition, a DSM scheme based on wind power forecasts is proposed. Furthermore, the impact of the level of RESs integration on carbon emissions, electricity price and consumption cost is quantified. Both forecasting and classification techniques are utilized for efficient energy management: forecasting is performed for electricity load, price, wind power and carbon emissions, whereas classification is used to distinguish fair from fraudulent electricity consumers. Electricity load forecasting is required to balance electricity demand and supply. Three models are proposed for this purpose: Deep Long Short-Term Memory (DLSTM), Efficient Sparse Autoencoder Nonlinear Autoregressive eXogenous network (ESAENARX) and Differential Evolution Recurrent Extreme Learning Machine (DE-RELM). DLSTM utilizes univariate data and produces a single output, whereas ESAENARX and DE-RELM model multivariate data and predict electricity load and price simultaneously. 
Owing to its adaptive and automatic feature learning mechanism, DLSTM achieves accurate results when electricity load and price are forecast separately. The ESAENARX and DE-RELM models are enhanced by a newly proposed efficient feature extractor and by tuning of the model parameters, respectively. Real-world datasets from ISO-NE, PJM and NYISO are used for load and price forecasting. Forecasting electricity load, price, wind power and carbon emissions serves the purpose of regulating electricity market operations. Wind power generation is predicted by an efficient model named Efficient Deep Convolution Neural Network (EDCNN). Moreover, a DSM strategy is proposed based on the predicted wind power generation. Power utilities have to pay a carbon emissions tax imposed by the government. To reduce this tax, carbon emissions prediction is required, which helps in encouraging electricity consumers to shift their consumption load to the low carbon price periods of the day. To accomplish the carbon emissions forecasting task, an efficient model named Improved Particle Swarm Optimization based Deep Neural Network (IPSO DNN) is proposed. In this model, the parameters of the DNN are tuned by a newly proposed improved optimization technique, IPSO. The ISO-NE dataset is used for wind power and carbon emissions forecasting. ETD is very important for reducing the financial loss of power utilities. For this purpose, four models are proposed: Differential Evolution Random Under Sampling Boosting (DE-RUSBoost), Jaya-RUSBoost, RUS Ensemble CNN (RUSE-CNN) and anomaly detection based ETD. In DE-RUSBoost and Jaya-RUSBoost, the parameters of the RUSBoost classifier are tuned by the DE and Jaya optimization techniques, respectively. In RUSE-CNN, the RUS data balancing technique is applied along with an ensemble CNN to improve ETD performance. DE-RUSBoost, Jaya-RUSBoost and RUSE-CNN are supervised models that work on labeled electricity theft data. 
In contrast, the anomaly detection based ETD model is capable of identifying electricity theft from unlabeled electricity consumption data. Real-world datasets from SGCC, UMass, PRECON, CER, EnerNOC and LCL are used for ETD. Simulation results show that all the proposed models perform significantly better on real-world datasets than their state-of-the-art counterparts. The improved feature engineering and model hyper-parameter tuning enhance the performance of the proposed models in terms of prediction and classification results.
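
As an illustration of the Random Under Sampling (RUS) data balancing step mentioned above for the ETD models, the following is a minimal sketch in plain Python. It is not the implementation used in this thesis: the function name `random_under_sample`, the `seed` parameter, and the label convention (0 for fair consumers, 1 for fraudulent consumers) are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def random_under_sample(samples, labels, seed=0):
    """Randomly drop samples from larger classes until every class
    has as many samples as the smallest class.

    Hypothetical helper for illustration only; the name, signature and
    seed handling are assumptions, not taken from the thesis.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # The smallest class determines the size of every class after balancing.
    target = min(len(xs) for xs in by_class.values())

    # Draw `target` samples without replacement from each class.
    balanced = []
    for y, xs in by_class.items():
        for x in rng.sample(xs, target):
            balanced.append((x, y))

    # Shuffle so that the classes are interleaved in the output.
    rng.shuffle(balanced)
    out_samples = [x for x, _ in balanced]
    out_labels = [y for _, y in balanced]
    return out_samples, out_labels


# Toy example: eight fair consumers (label 0), two fraudulent ones (label 1).
toy_samples = list(range(10))
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
balanced_samples, balanced_labels = random_under_sample(toy_samples, toy_labels)
```

In this toy example the balanced set contains two samples per class. Under-sampling discards majority-class data, which is one reason it is commonly paired with an ensemble or boosting scheme, as in the RUSBoost and RUSE-CNN models named above.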